Abstract. Trevisan has shown that constructions of pseudo-random generators from hard functions
(the Nisan-Wigderson approach) also produce extractors. We show that constructions of pseudo-random
generators from one-way permutations (the Blum-Micali-Yao approach) can be used for building
extractors as well. Using this new technique we build extractors that do not use designs or
polynomial-based error-correcting codes and that are very simple and efficient. For example, one
extractor produces each output bit separately in $O(\log^2 n)$ time. These extractors work for weak
sources with min-entropy $\lambda n$, for arbitrary constant $\lambda > 0$, have seed length
$O(\log^2 n)$, and their output length is $\approx n^{\lambda/3}$.
1 Introduction



Extractors are procedures that remedy an imperfect source of random strings. They have been the object
of intense research in recent years and several relevant techniques have been developed. This paper
puts forward a new framework for constructing extractors based on a new connection between extractors
and pseudo-random generators. Certainly, in some respects, there are obvious similarities between the
two concepts. A pseudo-random generator takes as input a short random string called the seed and
outputs a long string that cannot be distinguished from a truly random string by any test that is
computable by circuits of bounded size. An extractor has two inputs: (a) the first one comes from an
imperfect (i.e., with biased bits and correlations among bits) distribution on binary strings of some
length and is called the weakly-random string; (b) the second one is a short random seed. The output
is a long string that cannot be distinguished from a truly random string by any test. One difference
between pseudo-random generators and extractors is the number of inputs (one versus two). From a
technical point of view this difference is minor because the known constructions of pseudo-random
generators implicitly do use an extra input, which is a function that in some sense is computationally
hard. The fundamental difference is in the randomness requirement for the output. While the output of
a pseudo-random generator looks random in a complexity-theoretic way, the output of an extractor is
random (or very close to random) in an absolute information-theoretic way. Consequently pseudo-random
generators and extractors appear to belong to two very different worlds, and, for many years, the
developments in the construction of pseudo-random generators and extractors went along distinct
research lines.

Trevisan [Tre01] has made a breakthrough contribution in this area by observing that the (apparently
superficial) similarity between extractors and pseudo-random generators extends to some of the methods
used to build the two kinds of objects. For the reasons mentioned above, Trevisan's result has been
extremely surprising. It has also been an isolated example of a transfer from the standard arsenal of
complexity-theoretic techniques to the information-theoretic area. In this paper we extend Trevisan's
observation and establish that, as far as construction methods are concerned, there is a truly close
relationship between pseudo-random generators and extractors. Specifically, we show that the other
major route (other than the one followed by Trevisan) that leads to pseudo-random generators (of a
somewhat different kind) can also be used to construct extractors. Some explanations are in order at
this point.

There are two known approaches for constructing pseudo-random generators. One approach uses
as a building block a hard function $f$ and, in one typical setting of parameters, for any given
$k \in \mathbb{N}$, builds a pseudo-random generator $g$ with outputs of length $n$ that is secure
against adversary tests computable in time $n^k$. The running time to compute $g(x)$ is $n^{k'}$, for
some $k' > k$. This kind of pseudo-random generator can be used for derandomizing BPP computations.
It cannot be used in cryptography, because in this setting it is unwise to assume that the adversary
is endowed with less computational power ($n^k$) than the legitimate users ($n^{k'}$). Henceforth we
will call this type of pseudo-random generator a "derandomization pseudo-random generator" (also known
as a Nisan-Wigderson pseudo-random generator).

The second approach uses as a building block a hard object of a more sophisticated type, namely
a one-way function (the hardness of such a function $f$ consists in the difficulty of inverting it, but
$f$ must satisfy an additional property, namely, it should be easy to calculate $f(x)$ given $x$). It
is known that given a one-way function, one can construct a pseudo-random generator [HILL99]. An easier
construction produces a pseudo-random generator from any one-way length-preserving permutation.
This second approach has the disadvantage that it uses as a building block a more demanding type of
object. The advantage of the method is that a pseudo-random generator $g$ constructed in this way can
be used in cryptography because $g(x)$ can be calculated in time significantly shorter than the time an
adversary must spend to distinguish $g(x)$ from a truly random string. Henceforth we will call this
type of pseudo-random generator a "crypto pseudo-random generator" (also known as a Blum-Micali-Yao
pseudo-random generator).

Trevisan has shown that the known methods for constructing derandomization pseudo-random generators
also produce extractors. More precisely, he has shown that the constructions of pseudo-random
generators from hard functions given by Nisan and Wigderson [NW94] and Impagliazzo and Wigderson
[IW97] can be used almost directly to produce extractors. His method has been extended in a
number of papers to build extractors with increasingly better parameters (see the survey paper by
Shaltiel [Sha02]). In the paper [Tre99], the conference version of [Tre01], Trevisan has suggested that
the methods used to construct crypto pseudo-random generators cannot be used to build extractors. We
show that in fact they can, at least for a combination of parameters that, even though not optimal, is
not trivial. Moreover, we show that the extractors constructed in this way are very simple and
efficient.

An extractor can be viewed as a bipartite graph and is therefore a static finite object that can be
constructed trivially by exhaustive search. We are looking however for efficient constructions.
Typically "efficient" means "polynomial time," but one can envision different levels of efficiency and
one remarkable such level would certainly be "linear time." The first extractor built in this paper
follows almost directly the classical construction of a pseudo-random generator from a one-way
permutation, and comes very close to this level of efficiency: viewed as a procedure, it runs in
$O(n \log n)$ time (in the standard RAM model). In addition it is very simple. The following is a
complete description of it. The input consists of the weakly-random string $X$, of length
$n = \tilde{n} 2^{\tilde{n}}$ for some integer $\tilde{n}$, and of the seed
$((x_1, \ldots, x_\ell), r)$, with $|x_i| = \tilde{n}$, $\ell = O(\tilde{n})$, and
$|r| = \ell \tilde{n}$. We view $X$ as a function $X: \{0,1\}^{\tilde{n}} \to \{0,1\}^{\tilde{n}}$,
and, using the standard procedure, we transform $X$ into a circular permutation
$R: \{0,1\}^{\tilde{n}} \to \{0,1\}^{\tilde{n}}$. For $i = 0$ to $m - 1 = n^{\Omega(1)}$, we calculate
$b_i$ as the inner product modulo 2 of $r$ and $(R^i(x_1) \ldots R^i(x_\ell))$. The output is
$b_0 \ldots b_{m-1}$.
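The loop just described can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the circular permutation $R$ is taken as already given (as a Python list, so the "standard procedure" that turns $X$ into $R$ is not shown), strings are represented as integers, and the names `first_extractor`, `block_len`, `xs`, and `r_bits` are hypothetical.

```python
def first_extractor(R, block_len, xs, r_bits, m):
    """Sketch: b_i = <r, R^i(x_1) ... R^i(x_l)> mod 2 for i = 0..m-1.

    R         -- assumed given: list with R[v] = image of v under a circular
                 permutation of range(2**block_len)
    block_len -- length of each block x_j in bits
    xs        -- seed points x_1..x_l as integers in range(2**block_len)
    r_bits    -- list of l * block_len seed bits
    """
    current = list(xs)  # holds R^i(x_1), ..., R^i(x_l); i = 0 initially
    output = []
    for _ in range(m):
        bits = []
        for v in current:
            # big-endian bits of v, appended to the concatenation
            bits.extend((v >> (block_len - 1 - j)) & 1 for j in range(block_len))
        output.append(sum(a & b for a, b in zip(r_bits, bits)) % 2)
        current = [R[v] for v in current]  # advance every point by one step
    return output
```

Each output bit costs one pass over $\ell$ blocks, and moving from $b_i$ to $b_{i+1}$ needs only one application of $R$ per block.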

Another remarkable level of efficiency, which has received a lot of attention recently, is that of
sublinear time. It may be the case that in some applications we only need the $i$-th bit from the
sequence of random bits that are extracted from the weakly-random string. We would like to obtain this
bit in time polynomial in the length of the index $i$, which typically means polylog time in the input
length (under the assumption that each input bit can be accessed in one time unit). By analogy with
the case of list-decodable codes, we call an extractor with this property a bitwise locally computable
extractor.^1 The second extractor that we build is of this type. The algorithm deviates from the direct
construction of a pseudo-random generator from a one-way function. However, it relies on a basic idea
used in the construction of the first extractor, combined with the idea of taking consecutive inputs
of the hard function as in the extractor of Ta-Shma, Zuckerman and Safra [TSZS01]. This second
extractor is even simpler and its complete description is as follows. The input consists of the
weakly-random string $X$ of length $n = \tilde{n} \cdot 2^{\tilde{n}}$, for some natural number
$\tilde{n}$, and of the seed $((x_1, \ldots, x_\ell), r)$, with $|x_i| = \tilde{n}$ for all $i$,
$\ell = O(\tilde{n})$, and $|r| = \ell \tilde{n}$. We view $X$ as the truth-table of a function
$X: \{0,1\}^{\tilde{n}} \to \{0,1\}^{\tilde{n}}$. For $i = 0$ to $m - 1 = n^{\Omega(1)}$, we calculate
$b_i$ as the inner product modulo 2 of $r$ and $(X(x_1 + i), \ldots, X(x_\ell + i))$, where
the addition is done modulo $2^{\tilde{n}}$. The output is $b_0 \ldots b_{m-1}$.
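The description of the second extractor translates almost verbatim into code. The following is a minimal sketch, assuming the truth-table of $X$ is stored as a list of integers and strings are identified with integers; the function names (`output_bit`, `extract`) and the big-endian bit order are choices made only for this illustration.

```python
def int_to_bits(value, width):
    """Big-endian list of `width` bits of a non-negative integer."""
    return [(value >> (width - 1 - j)) & 1 for j in range(width)]


def output_bit(table, block_len, xs, r_bits, i):
    """i-th output bit: <r, X(x_1 + i) ... X(x_l + i)> mod 2.

    table     -- truth-table of X: list of 2**block_len integers,
                 each in range(2**block_len)
    xs        -- seed points x_1..x_l as integers in range(2**block_len)
    r_bits    -- list of l * block_len seed bits
    """
    modulus = 1 << block_len
    concatenated = []
    for x in xs:
        # addition is done modulo 2**block_len, as in the description
        concatenated.extend(int_to_bits(table[(x + i) % modulus], block_len))
    return sum(a & b for a, b in zip(r_bits, concatenated)) % 2


def extract(table, block_len, xs, r_bits, m):
    """All m output bits; each one is computed independently of the others."""
    return [output_bit(table, block_len, xs, r_bits, i) for i in range(m)]
```

Note that `output_bit` reads only $\ell$ entries of the table, which is what makes each bit computable separately.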

The parameters of the extractors constructed in this paper are not optimal. Both extractors that
have been described above work for weak sources having min-entropy $\lambda n$, for arbitrary constant
$\lambda > 0$, use a random seed of length $O(\log^2 n)$, and the output length is approximately
$n^{\lambda/3}$. A variant of the second extractor has seed length $O(\log n)$ (here, for simplicity,
we assume that the extractor's error
parameter e is a constant), but the output length reduces to 2*^^^^°^ "■\ 

Lu's extractor [Lu04] coupled with the constructions of designs from the paper of Hartman and
Raz [HR03] can be seen to be also a bitwise locally computable extractor with parameters similar to
those of our second extractor (note that the designs in [HR03] appear to imply extractors with seed
length $\Omega(\log^2 n)$). Lu's extractor uses expander graphs and the designs from [HR03] need
somewhat unwieldy algebraic objects. It seems to us that the extractors presented in this paper are
simpler than all the extractors from the literature.^2 At the highest level of abstraction, our
extractors follow the "reconstruction paradigm" (see [Sha02]) typical of Trevisan's extractor and of
its improvements [RRV99, TSZS01, SU01]. The major differences are that our extractors avoid (1) the use
of designs (in this respect they are similar to the extractors in [TSZS01] and [SU01]), and, perhaps
more strikingly, (2) the encoding of the weakly-random string with an error-correcting code having a
good list-decoding property. Our extractors can be implemented very easily and are thus suitable for
practical applications. For example, they can be utilized to generate one-time pad keys in
cryptosystems based on the bounded-storage model (see the papers of Lu [Lu04] and Vadhan [Vad04]), or
for constructions of error-correcting codes using the scheme in [TSZ01] (the extractors built in this
paper are actually strong extractors, as required by this scheme; for the definition see, for example,
[Sha02]). They may also have theoretical applications in situations where the kind of efficiency
achieved by our extractors is essential.

2 Definitions 

Notations: $x \odot y$ denotes the concatenation of the strings $x$ and $y$, $|x|$ denotes the length
of the string $x$, and $\|A\|$ denotes the cardinality of the set $A$. We recall the standard
definition of an extractor. Let $n \in \mathbb{N}$. Let $X_n, Y_n$ be two distributions on $\Sigma^n$
(where $\Sigma = \{0,1\}$). The statistical distance between $X_n$ and $Y_n$ is denoted
$\Delta_{stat}(X_n, Y_n)$ and is defined by
$\Delta_{stat}(X_n, Y_n) = \max_{A \subseteq \{0,1\}^n} |\mathrm{Prob}(X_n \in A) - \mathrm{Prob}(Y_n \in A)|$.
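For small supports the definition can be evaluated directly. The sketch below, which is only an illustration, represents a distribution as a list of probabilities indexed by the points of the support and compares the brute-force maximum over all tests $A$ with the well-known closed form (half of the $L_1$ distance); the function names are hypothetical.

```python
from itertools import combinations


def stat_distance(p, q):
    """Half the L1 distance, which equals max over tests A of |P(A) - Q(A)|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def stat_distance_bruteforce(p, q):
    """Maximise |P(A) - Q(A)| over every subset A of the support."""
    points = range(len(p))
    best = 0.0
    for size in range(len(p) + 1):
        for subset in combinations(points, size):
            gap = abs(sum(p[a] for a in subset) - sum(q[a] for a in subset))
            best = max(best, gap)
    return best
```

The brute-force version enumerates $2^{2^n}$ tests for distributions on $\{0,1\}^n$, so it is usable only as a sanity check on toy examples.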

If we view the sets $A \subseteq \{0,1\}^n$ as statistical tests, then, by the above expression,
$\Delta_{stat}(X_n, Y_n) \le \epsilon$ signifies that no test can distinguish between the distributions
$X_n$ and $Y_n$ except with a small bias $\epsilon$. If we restrict to tests that can be calculated by
bounded circuits, we obtain the notion of computational distance between distributions. Namely, the
computational distance between $X_n$ and $Y_n$ relative to size $S$ is denoted
$\Delta_{comp,S}(X_n, Y_n)$ and is defined by
$\Delta_{comp,S}(X_n, Y_n) = \max |\mathrm{Prob}(C(X_n) = 1) - \mathrm{Prob}(C(Y_n) = 1)|$, where the
maximum is taken over all circuits $C$ of size $\le S$. Abusing notation, we identify a circuit $C$
with the set of strings $x$ for which $C(x) = 1$. Thus, $x \in C$ is equivalent to $C(x) = 1$.

^1 The simpler name locally computable extractor is already taken by a different kind of efficient
extractors, namely by extractors computable in space linear in the output length, see [Vad04], [Lu04].

^2 We note that Dziembowski and Maurer [DM04] give a similarly simple construction of an object that is
related to extractors.

The min-entropy of a distribution is a good indicator of the degree of randomness of the distribution.
The min-entropy of a random variable $X$ taking values in $\{0,1\}^n$ is given by
$\min \left\{ \log \frac{1}{\mathrm{Prob}(X = a)} \;\middle|\; a \in \{0,1\}^n, \mathrm{Prob}(X = a) \neq 0 \right\}$.

Thus if $X$ has min-entropy $\ge k$, then for all $a$ in the range of $X$,
$\mathrm{Prob}(X = a) \le 1/2^k$. For each $n \in \mathbb{N}$, let $U_n$ denote the uniform
distribution over $\{0,1\}^n$. We are now ready to define an extractor formally.
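As a quick numerical illustration of the min-entropy formula, the sketch below computes it for a distribution given as a list of probabilities (the representation and the name `min_entropy` are choices made for this example only; logarithms are base 2).

```python
import math


def min_entropy(p):
    """min over a with Prob(X = a) != 0 of log2(1 / Prob(X = a))."""
    return min(math.log2(1.0 / prob) for prob in p if prob > 0)
```

The value is determined by the single most likely outcome, so a distribution with one heavy point has low min-entropy no matter how the remaining mass is spread.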

Definition 2.1 (Extractor) The values $n, k, d, m$ are integer parameters, and $\epsilon > 0$ is a real
number parameter. A function $E: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a
$(k, \epsilon)$-extractor if for every distribution $X$ on $\{0,1\}^n$ with min-entropy at least $k$,
the distribution $E(X, U_d)$ is $\epsilon$-close to the uniform distribution $U_m$ in the statistical
sense, i.e., $\Delta_{stat}(E(X, U_d), U_m) \le \epsilon$.

Thus, an extractor has as input (a) a string $x$ produced by an imperfect source with distribution $X$,
where the defect of the distribution is measured by $k = \text{min-entropy}(X)$, and (b) a random seed
$y$ of length $d$. The output is $E(x, y)$, a string of length $m$. The key property is that, for every
subset $W \subseteq \Sigma^m$,

$$\left| \mathrm{Prob}_{x \in_X \{0,1\}^n,\, y \in \{0,1\}^d}(E(x, y) \in W) - \mathrm{Prob}_{z \in \Sigma^m}(z \in W) \right| \le \epsilon. \qquad (1)$$

If we consider $n$ and $k$ as given (these are the parameters of the source), it is desirable that $d$
is small, $m$ is large, and $\epsilon$ is small. It can be shown nonconstructively that for every
$k \le n$ and $\epsilon > 0$, there exist extractors with
$d = \log(n - k) + 2\log(1/\epsilon) + O(1)$ and $m = k + d - 2\log(1/\epsilon) - O(1)$. It has
been shown [RTS00] that these parameters are optimal. Furthermore, we want the family of extractors
to be efficiently computable. For simplicity, we have defined individual extractors. However,
implicitly we think of a family of extractors indexed by $n$ and with the other parameters being
uniform functions of $n$. In this way we can talk about efficient constructions of extractors by
looking at the time and space required to calculate $E(x, y)$ as functions of $n$.

An extractor $E: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ can also be viewed as a regular bipartite
graph where the set of "left" nodes is $V_{left} = \{0,1\}^n$ and the set of "right" nodes is
$V_{right} = \{0,1\}^m$. The degree of each node in $V_{left}$ is $2^d$, and two nodes
$x \in V_{left}$ and $z \in V_{right}$ are connected if there is $y \in \{0,1\}^d$ such that
$E(x, y) = z$. We can imagine that each $x \in V_{left} = \{0,1\}^n$ is throwing $2^d$ arrows at
$V_{right} = \{0,1\}^m$.

To better understand Equation (1), let us look deeper into the structure of an extractor. We fix
parameters $n, d, m$ and $\epsilon$ and a function $E: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$. Let
us consider an arbitrary set $W \subseteq \{0,1\}^m$ and a string $x \in \{0,1\}^n$. We say that $x$
hits $W$ $\epsilon$-correctly via $E$ if the fraction of outgoing edges from $x$ that land in $W$ is
$\epsilon$-close to the fraction $\|W\| / \|\{0,1\}^m\|$, i.e.,

$$\left| \frac{\|\{E(x, y) \mid y \in \{0,1\}^d\} \cap W\|}{\|\{0,1\}^d\|} - \frac{\|W\|}{\|\{0,1\}^m\|} \right| \le \epsilon,$$

where the set $\{E(x, y) \mid y \in \{0,1\}^d\}$ is counted with multiplicities (one element per edge).

If we look at a fixed $x$, it cannot hold that for every $W \subseteq \{0,1\}^m$, $x$ hits $W$
$\epsilon$-correctly (for example, take $W = \{E(x, y) \mid y \in \{0,1\}^d\}$). Fortunately, for $E$
to be an extractor, all we need is that any $W \subseteq \{0,1\}^m$ is hit $\epsilon$-correctly by most
$x \in \{0,1\}^n$. The following lemma has appeared more or less explicitly in the literature (see, for
example, [Sha02]).
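The notion of hitting a set $\epsilon$-correctly is easy to check exhaustively on toy parameters. The sketch below is an illustration only: strings are tuples of bits, `E` is any Python function of $(x, y)$, and the helper names are hypothetical. It counts, for a given test $W$, how many $x$ fail to hit $W$ $\epsilon$-correctly, which is the quantity bounded in the hypothesis of the lemma that follows.

```python
from itertools import product


def hit_fraction(E, x, d, W):
    """Fraction of the 2**d outgoing edges of x that land in W."""
    seeds = list(product((0, 1), repeat=d))
    return sum(1 for y in seeds if E(x, y) in W) / len(seeds)


def hits_correctly(E, x, d, m, W, eps):
    """True if x hits W eps-correctly via E."""
    return abs(hit_fraction(E, x, d, W) - len(W) / 2 ** m) <= eps


def count_bad_inputs(E, n, d, m, W, eps):
    """Number of x in {0,1}^n that do not hit W eps-correctly via E."""
    return sum(
        1
        for x in product((0, 1), repeat=n)
        if not hits_correctly(E, x, d, m, W, eps)
    )


def toy_E(x, y):
    """Toy one-bit function (inner product mod 2), used only as an example."""
    return (sum(a & b for a, b in zip(x, y)) % 2,)
```

For `toy_E` with $n = d = 2$, $m = 1$ and $W = \{1\}$, the all-zero string is the only input that misses: all of its edges land on output 0.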

Lemma 2.2 Let $E: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ and $\epsilon > 0$. Suppose that for every
$W \subseteq \{0,1\}^m$, the number of $x \in \{0,1\}^n$ that do not hit $W$ $\epsilon$-correctly via
$E$ is at most $2^t$, for some $t$. Then $E$ is a $(t + \log(1/\epsilon), 2\epsilon)$-extractor.
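Proof. Let $X$ be a distribution on $\{0,1\}^n$ with min-entropy at least $t + \log(1/\epsilon)$ and
let $W$ be a subset of $\{0,1\}^m$. There are at most $2^t$ $x$'s that do not hit $W$
$\epsilon$-correctly, and the distribution $X$ allocates to these $x$'s a probability mass of at most
$2^t \cdot 2^{-(t + \log(1/\epsilon))} = \epsilon$. We have

$$\begin{aligned}
\mathrm{Prob}_{x \in_X \{0,1\}^n,\, y \in \{0,1\}^d}(E(x, y) \in W)
&= \mathrm{Prob}_{x \in_X \{0,1\}^n,\, y \in \{0,1\}^d}(E(x, y) \in W \text{ and } x \text{ hits } W\ \epsilon\text{-correctly}) \\
&\quad + \mathrm{Prob}_{x \in_X \{0,1\}^n,\, y \in \{0,1\}^d}(E(x, y) \in W \text{ and } x \text{ does not hit } W\ \epsilon\text{-correctly}).
\end{aligned}$$

For each $x$ that hits $W$ $\epsilon$-correctly,

$$\frac{\|W\|}{\|\{0,1\}^m\|} - \epsilon \le \mathrm{Prob}_{y \in \{0,1\}^d}(E(x, y) \in W) \le \frac{\|W\|}{\|\{0,1\}^m\|} + \epsilon.$$

Since the $x$'s that hit $W$ $\epsilon$-correctly carry probability mass at least $1 - \epsilon$, the
first term on the right-hand side is at most $\frac{\|W\|}{\|\{0,1\}^m\|} + \epsilon$ and at least
$\frac{\|W\|}{\|\{0,1\}^m\|} - 2\epsilon$. The second term is bounded by

$$\mathrm{Prob}_{x \in_X \{0,1\}^n,\, y \in \{0,1\}^d}(x \text{ does not hit } W\ \epsilon\text{-correctly}),$$

which is, as we have seen, between $0$ and $\epsilon$. Plugging these estimates into the above
equation, we obtain that

$$\left| \mathrm{Prob}_{x \in_X \{0,1\}^n,\, y \in \{0,1\}^d}(E(x, y) \in W) - \frac{\|W\|}{\|\{0,1\}^m\|} \right| \le 2\epsilon.$$

Thus, $E$ is a $(t + \log(1/\epsilon), 2\epsilon)$-extractor. $\blacksquare$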
We recall the definition of a pseudo-random generator. 

Definition 2.3 (Pseudo-random generator) Let $\ell, L, S \in \mathbb{N}$ and $\epsilon > 0$ be
parameters. A function $g: \Sigma^\ell \to \Sigma^L$ is a pseudo-random generator with security
$(\epsilon, S)$ if $\Delta_{comp,S}(g(U_\ell), U_L) \le \epsilon$.



3 Overview and comparison with Trevisan's approach 

Trevisan's method is based on the constructions of pseudo-random generators from hard functions given
in [NW94] and in [IW97]. These constructions use a function $f$ as a black box and construct from it
a function $g_f$ that stretches the input (i.e., $|g_f(x)| \gg |x|$) and which has the following
property. If there exists a circuit $D$ that distinguishes $g_f(x)$, when $x$ is randomly chosen in the
domain of $g_f$, from the uniform distribution, then there is a small circuit $A$, which uses $D$ as a
subroutine, such that $A$ calculates $f$ (or an approximation of $f$, depending on whether we are using
the method in [IW97] or the one in [NW94]). Therefore if $f$ is a hard function, there can be no
circuit $D$ as above of small size and thus $g_f$ is a pseudo-random generator. Trevisan has observed
that (1) the truth-table of $f$ can be viewed as a string produced by a weak source that can serve as
an extra input of the pseudo-random generator, and (2) the circuit $A$ invoking $D$ can be considered
as a special type of circuit that is endowed with $D$-gates. By a standard counting argument, it can
be shown that, for any circuit $D$, regardless of its size, the set of functions that can be calculated
by small circuits with $D$-gates is small. A circuit $D$ can be viewed statically as a statistical test
(more exactly, the statistical test associated to the circuit $D$ is the set of strings accepted by
$D$). In the new terminology, the fact that $D$ distinguishes the distribution of $g_f(x)$ from the
uniform distribution with bias $\epsilon$ can be restated as "$f$ does not hit $D$
$\epsilon$-correctly via $g$." The main property mentioned above can be restated as saying that the
set of functions $f$ that do not hit $D$ $\epsilon$-correctly is included in the set of functions
computable by small circuits with $D$-gates. Since the latter set is small, the former set is small as
well, and thus, by Lemma 2.2, the construction yields an extractor. In a nutshell, Trevisan's method
replaces hard functions (a complexity-theoretic concept) with random functions (an
information-theoretic concept) and takes advantage of the fact that a random function is hard, and
thus the construction carries over to the new setting.

We would like to follow a similar approach for the construction of crypto pseudo-random generators
from one-way permutations. Those constructions do use a one-way permutation $R$ as a black box to
construct a pseudo-random generator $g_R$, and thus a truth-table of $R$ can be considered as an extra
input of the pseudo-random generator. Also, the proof is a reduction that shows that if a circuit $D$
distinguishes $g_R(x)$ from the uniform distribution, then there is a small circuit $A$, invoking the
circuit $D$, that inverts $R$ on a large fraction of inputs. To close the proof in a similar way to
Trevisan's approach, we would need to argue that the vast majority of permutations are one-way. It
seems that we hit a major obstacle because, unlike the case of hard functions, it is not currently
known if even a single one-way function exists (and we are seeking an unconditional proof for the
extractors that we build). We go around this obstacle by allowing algorithms to have oracle access to
the function they compute. Thus, in the above analysis, the circuit $A$, in addition to invoking the
circuit $D$, will also have oracle access to the permutation $R$. In this setting all permutations are
easy to compute because, obviously, there is a trivial constant-time algorithm that, for any
permutation $R: \{0,1\}^n \to \{0,1\}^n$, given the possibility to query $R$, calculates $R(x)$. We
need to argue that only few permutations $R$ are invertible by algorithms that can query $R$ in a
bounded fashion. More precisely, we need to estimate the size of the set of permutations
$R: \{0,1\}^n \to \{0,1\}^n$ that can be inverted on a set of $T$ elements in $\{0,1\}^n$ by
circuits that can pose $Q$ queries to $R$. This problem has been considered by Impagliazzo [Imp96] and
by Gennaro and Trevisan [GT00]. Their techniques seem to work for the case $T \cdot Q < 2^n$ and lead
to extractors that work only for sources with high min-entropy.^3

^3 On the other hand, these extractors have the interesting property that their output looks random
even to statistical tests that have some type of access to the weakly-random string. These results
will be reported in a separate paper.

We obtain better parameters by restricting the type of one-way permutations and the type of circuits
that attempt to invert them. A second look at the standard construction of Blum-Micali-Yao
pseudo-random generators reveals that the circuit $A$ with $D$-gates manages to determine $x$ using
only the values $R(x), R^2(x), \ldots, R^m(x)$ (where $m$ is the generator's output length). It is thus
enough to consider only circuits that use this pattern of queries to the permutation $R$. Intuitively,
for a random permutation $R$, the value of $x$ should be almost independent of the values of
$R(x), R^2(x), \ldots, R^m(x)$, and thus a circuit $A$ restricted as above can invert only a very small
fraction of permutations. If we take $R$ to be a random circular permutation, the above intuition can
be easily turned into a proof based on a Kolmogorov-complexity counting argument. A circular
permutation $R: \{0,1\}^n \to \{0,1\}^n$ is fully specified by the sequence
$(R(1), R^2(1), \ldots, R^{N-1}(1))$, where $N = 2^n$. If a circuit $A$ restricted as above inverts
$R(x)$ for all $x$, then the permutation $R$ is determined by the last $m$ values in the above
sequence, namely $R^{N-m}(1), R^{N-m+1}(1), \ldots, R^{N-1}(1)$. Indeed, given the above values, the
circuit $A$ can determine $R^{N-m-1}(1)$, which is $R^{-1}(R^{N-m}(1))$, and then $R^{N-m-2}(1)$, and
so on until $R(1)$ is determined. Therefore such a permutation $R$, given the circuit $A$, can be
described concisely using only $m \cdot n$ bits (for specifying, as discussed, the last $m$ elements in
the above sequence). In fact, in our case, the circuit $A$ does not invert $R(x)$ for all
$x \in \{0,1\}^n$, and, therefore, the values of $R$ at the points where the inversion fails have to be
included in the description. A further complication is that even for the successful cases, the circuit
$A$ only list-inverts $R(x)$, which means that $A$ on input $R(x)$ produces a relatively short list of
elements, one of which is $x$. Thus, one also has to include in the description of $R$ the rank of $x$
in the list produced by $A$. The quantitative analysis of the standard construction of a crypto
pseudo-random generator shows that if the permutation $R$ does not hit $D$ $\epsilon$-correctly, then
the circuit $A$ with $D$-gates is only able to produce, for an $\epsilon/m$ fraction of $R(x)$,
$x \in \{0,1\}^n$, a list with $m^2/\epsilon^2$ elements one of which is $x$. For interesting values of
$m$ (the pseudo-random generator's output length), the $\epsilon/m$ fraction is too small and needs to
be amplified to a value of the form $(1 - \delta)$, for a small constant $\delta$. This can be done by
employing another technique that is well-known in the context of one-way functions. Namely, we use
Yao's method of converting a weak one-way function into a strong one-way function by taking the direct
product. In other words, we start with a circular permutation $R$, define (the direct product)
$\overline{R}(x_1, \ldots, x_\ell) = R(x_1) \odot \ldots \odot R(x_\ell)$ (where $\odot$ denotes
concatenation), for some appropriate value of $\ell$, and use $\overline{R}$ in the definition of the
extractor (instead of $R$ in our tentative plan sketched above). It can be shown that, for
$\ell = O((1/\delta) \log(1/\gamma))$, if a circuit $A$ list-inverts $(y_1, \ldots, y_\ell)$, with list
size $T = m^2/\epsilon^2$, for a $\gamma = \epsilon/m$ fraction of $\ell$-tuples
$(y_1, \ldots, y_\ell) \in (\{0,1\}^n)^\ell$, then there is a probabilistic algorithm $A'$ that
list-inverts $R(x)$ with list size $O(n \cdot T \cdot (1/\delta) \cdot (1/\gamma) \cdot \log(1/\gamma))$
for a $(1 - \delta)$ fraction of $x \in \{0,1\}^n$. By fixing the random bits and the queries that
depend on these random bits, we can obtain a brief description of $R$ as in our first tentative plan.
It follows that only few permutations $R$ can hit $D$ $\epsilon$-incorrectly and, therefore, by
Lemma 2.2, we have almost obtained an extractor (we also need to convert an arbitrary function
$X: \{0,1\}^n \to \{0,1\}^n$ into a circular permutation $R: \{0,1\}^n \to \{0,1\}^n$, which is an
easy task).

Briefly, the proof relies on the fact that if a permutation $R$ does not hit $D$ $\epsilon$-correctly,
then there must be a very strong dependency between the "consecutive" values
$x, R(x), R^2(x), \ldots, R^m(x)$, for many $x \in \{0,1\}^n$, and only few permutations $R$ exhibit
such dependencies.

The second extractor starts from this idea and the observation that, for the sake of building an
extractor, we can work with a function $X$ (i.e., not necessarily a permutation) and consider
consecutive values $X(x), X(x + 1), \ldots, X(x + m)$, as in the extractor of Ta-Shma, Zuckerman, and
Safra [TSZS01]. That extractor (as well as all the extractors using the "reconstruction paradigm")
works with an encoding of an arbitrary function $X$, where the code has a good list-decoding property
and some other special algebraic properties. This is necessary, among other things, for the same type
of amplification as in our discussion above. We use instead a direct-product construction that is much
simpler to implement (however, the cost is a longer seed length).

4 Restricted permutations, restricted circuits 

The space from which we randomly choose permutations consists of permutations of a special form.
First we consider the set CIRC of all circular permutations $R: \{0,1\}^n \to \{0,1\}^n$. Next, for
some parameter $\ell \in \mathbb{N}$, we take the $\ell$-direct product of CIRC. This means that for
any $R \in \mathrm{CIRC}$, we define $\overline{R}_\ell: \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$ by
$\overline{R}_\ell(x_1 \odot x_2 \odot \ldots \odot x_\ell) = R(x_1) \odot R(x_2) \odot \ldots \odot R(x_\ell)$.
We let $\mathrm{PERM}_\ell$ be the set $\{\overline{R}_\ell \mid R \in \mathrm{CIRC}\}$. We will drop
the subscript $\ell$ when its value is clear from the context or when it is not relevant in the
discussion.
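The direct product is a purely blockwise operation, as the following small sketch illustrates. It assumes a permutation of $\{0,1\}^n$ is stored as a Python list indexed by integers and an $\ell$-tuple of blocks as a Python tuple; the names `is_circular` and `direct_product` are hypothetical.

```python
def is_circular(R):
    """True if the list R encodes a permutation consisting of a single cycle."""
    if sorted(R) != list(range(len(R))):
        return False  # not a permutation at all
    steps, v = 0, 0
    for _ in range(len(R)):
        v = R[v]
        steps += 1
        if v == 0:
            break
    return steps == len(R)  # the orbit of 0 must cover every point


def direct_product(R, ell):
    """Return the map (x_1, ..., x_ell) -> (R(x_1), ..., R(x_ell))."""
    def R_bar(blocks):
        assert len(blocks) == ell
        return tuple(R[x] for x in blocks)
    return R_bar
```

Because each block is handled independently, a query to the direct product costs exactly $\ell$ queries to $R$.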

We want to argue that no circuit that queries $\overline{R}$ in a restricted way can invert a "large"
fraction of $\overline{R}(x)$ except for a "small" fraction of permutations $\overline{R}$ in PERM. In
order to obtain adequate values for "large" and "small" we will impose the following restriction on
the pattern of queries that the circuit can make.

Definition 4.1 An oracle circuit $C$ on inputs of length at least $\ell \cdot n$ is $L$-restricted if
on any input $x$ and for all oracles $\overline{R} \in \mathrm{PERM}_\ell$, $C$ only queries
$x_{first}, \overline{R}(x_{first}), \overline{R}^2(x_{first}), \ldots, \overline{R}^{L-1}(x_{first})$,
where $x_{first}$ is the string consisting of the first $\ell \cdot n$ bits of $x$.

We will allow the circuits to attempt to invert $\overline{R}$ in a weaker form: on input
$\overline{R}(x)$, $C^{\overline{R}}$ outputs a small list of strings one of which (in case $C$
succeeds) is $x$. When this event happens, we say that $C^{\overline{R}}$ list-inverts
$\overline{R}(x)$. We are interested in estimating the number of permutations
$\overline{R} \in \mathrm{PERM}$ such that $C^{\overline{R}}$ list-inverts $\overline{R}(x)$ for a
large fraction of $x$.

Definition 4.2 Let $C$ be an oracle circuit. A permutation $\overline{R}$ is $(\gamma, T)$-good for $C$
if for at least a $\gamma$ fraction of $x \in \{0,1\}^{\ell n}$, $C^{\overline{R}}$ on input
$\overline{R}(x)$ outputs a list of $T$ elements that contains $x$.

We will show that a permutation that is $(\gamma, T)$-good for a restricted circuit $C$ admits a short
description conditioned on $C$ being given. This leads immediately to an estimation of the number of
permutations $\overline{R}$ that are $(\gamma, T)$-good for a given restricted circuit $C$.

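Lemma 4.3 Let $\gamma > 0$, $n \in \mathbb{N}$, $L \in \mathbb{N}$, and $T \in \mathbb{N}$. Let
$N = 2^n$. Let $\delta > 0$ and let $\ell = \lceil \frac{3}{\delta} \cdot \log(\frac{2}{\gamma}) \rceil$.
Assume $\delta > 2e^{-n}$ and $\ell < L + 1$. Let $C$ be an $L$-restricted circuit, having inputs of
length $\ell n$, and let $\overline{R} \in \mathrm{PERM}_\ell$ be a permutation that is
$(\gamma, T)$-good for $C$. Then, given $C$ and $\ell$, $\overline{R}$ can be described using
a number of bits that is bounded by $2\delta N n + Ln + N \log n + (\log 6) N + N \log(1/\delta) + N \log\log(2/\gamma) + N \log(1/\gamma) + N \log T$
+ 18n2 . L • i • log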

Proof. Since $\overline{R}$ is the $\ell$-direct product of $R$, it is enough to present a short
description of $R$. We will first show that the assumption that $C^{\overline{R}}$ list-inverts a
$\gamma$-fraction of $\overline{R}(x)$ with $x \in \{0,1\}^{\ell n}$ implies that there exists an
oracle circuit $B$ so that $B^R$ list-inverts a $(1 - \delta)$ fraction of $R(x)$ with
$x \in \{0,1\}^n$. The circuit $B$ is not $L$-restricted but it has a similar property. Namely, the
circuit $B$ makes two categories of queries to the oracle $R$. The first category consists of a set of
queries that do not depend on the input. The second category depends on the input $y$ and consists of
the queries $y, R(y), \ldots, R^{L-1}(y)$. The circuit $B$ is helpful in producing the concise
description of $R$ that we are seeking. Note that the permutation $R \in \mathrm{CIRC}$ is determined
by the vector $(R(1), R^2(1), \ldots, R^{N-1}(1))$. This vector will be described in the following way.
The last $L$ entries are described by themselves. Then we describe each of the other entries $y$ one
at a time going backwards in the vector. Suppose that $R(y), R^2(y), \ldots, R^L(y)$ are already
described. We describe now the preceding term in the sequence, which is $y$. There are two cases.

Case 1: $B^R$ list-inverts $R(y)$. In this case $y$ is determined by its rank in the list produced by
$B^R$ on input $R(y)$. The computation of $B^R$ on input $R(y)$ depends on the strings
$R(y), \ldots, R^L(y)$ (which are already described) and on the value of $R$ on the fixed queries
(these values have to be given in the description, but they are common to all the entries in the
vector).

Case 2: $B^R$ fails to list-invert $R(y)$ (this will happen only for a small fraction $\delta$ of
$y$'s). In this case $y$ is described by itself.

We will show that this description policy needs the asserted number of bits.

We proceed with the technical details. The amplification of the fraction of inverted inputs from
$\gamma$ to $(1 - \delta)$ is done using the well-known technique of producing strong one-way functions
from weak one-way functions (Yao [Yao82]). Let
$w = 6 \cdot \frac{1}{\delta} \cdot \log(2/\gamma) \cdot \frac{1}{\gamma}$. Recall that
$\ell = \lceil \frac{3}{\delta} \cdot \log(\frac{2}{\gamma}) \rceil$. It holds that, for
$n > \ln(2/\delta)$ and $\delta < 1/3$, $\frac{\ell}{w} < \gamma - (1 - \delta + e^{-n})^\ell$. Let INV
be the set of strings $\overline{R}(x)$ on which $C^{\overline{R}}$ outputs a $T$-list that contains
$x$. From the hypothesis, we know that $\|\mathrm{INV}\| \ge \gamma \cdot 2^{\ell n}$. We define
the following probabilistic algorithm $D$.

Input: $y = R(x)$, for some $x \in \Sigma^n$. Goal: Find a short list that contains $x$.
LIST = $\emptyset$.

Repeat the following $n \cdot w$ times.
    Pick random $i \in \{1, \ldots, \ell\}$.
    Pick $\ell - 1$ random strings in $\{0,1\}^n$ denoted $y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_\ell$.
    Calculate $Y = y_1 \odot \ldots \odot y_{i-1} \odot R(x) \odot y_{i+1} \odot \ldots \odot y_\ell$.
    Call the circuit $C^{\overline{R}}$ to invert $Y$. $C^{\overline{R}}$ returns a $T$-list of
    $\ell$-tuples in $(\{0,1\}^n)^\ell$.
    (Note: In case of success one of these $\ell$-tuples is
    $(R^{-1}(y_1), \ldots, R^{-1}(y_{i-1}), x, R^{-1}(y_{i+1}), \ldots, R^{-1}(y_\ell))$.)
    Add to LIST the $i$-th component of every $\ell$-tuple in the list produced by $C^{\overline{R}}$.
End Repeat
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Wc say that the above algorithm is successful on input y = R{x) if, at the conclusion of the algorithm, 
LIST contains x. Wc estimate the success probability of the above circuit on input y = R{x). 

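Let $N(y)$ be the multiset of $\ell$-tuples having $y$ as one component where the multiplicity of a tuple is
the number of occurrences of $y$ in the tuple. For a set $A \subseteq \{0,1\}^n$, we define $N(A) = \bigcup_{y \in A} N(y)$. It
can be seen that, for all $y \in \{0,1\}^n$, $\|N(y)\| = \ell \cdot 2^{n(\ell-1)}$. We define

$$V_w = \left\{ y \in \Sigma^n \;\middle|\; \frac{\|N(y) \cap \mathrm{INV}\|}{\|N(y)\|} \geq \frac{1}{w} \right\}.$$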

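Let $\bar V_w$ be the complement of $V_w$. We have

$$\|N(\bar V_w) \cap \mathrm{INV}\| \leq \sum_{y \in \bar V_w} \|N(y) \cap \mathrm{INV}\| \leq \|\bar V_w\| \cdot \frac{1}{w} \cdot \ell \cdot 2^{n(\ell-1)} \leq \frac{\ell}{w} \cdot \|(\Sigma^n)^\ell\|.$$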



We show that this is possible only if $\|\bar V_w\| < (\delta - e^{-n}) \cdot \|\Sigma^n\|$. Let $A \subseteq \Sigma^n$ be a set with $\|A\| \geq (\delta - e^{-n}) \cdot \|\Sigma^n\|$.
We observe that $N(A)$ covers an overwhelming fraction of $(\Sigma^n)^\ell$. Indeed, note that the probability
that a tuple $(y_1, \ldots, y_\ell)$ is not in $N(A)$ is equal to the probability of the event "$y_1 \notin A \wedge \ldots \wedge y_\ell \notin A$",
which is bounded by $(1 - \delta + e^{-n})^\ell$. Therefore, the complementary set of $N(A)$, denoted $\overline{N(A)}$, satisfies

$$\|\overline{N(A)}\| \leq (1 - \delta + e^{-n})^\ell \cdot \|(\Sigma^n)^\ell\|.$$

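Then,

$$\|N(A) \cap \mathrm{INV}\| = \|\mathrm{INV}\| - \|\mathrm{INV} \cap \overline{N(A)}\| \geq \|\mathrm{INV}\| - \|\overline{N(A)}\| \geq \left[\gamma - (1 - \delta + e^{-n})^\ell\right] \cdot \|(\Sigma^n)^\ell\|.$$

Recall that $\ell/w < \left[\gamma - (1 - \delta + e^{-n})^\ell\right]$. Thus necessarily $\|\bar V_w\| < (\delta - e^{-n}) \cdot 2^n$.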

On input $y = R(x)$, at each iteration, the algorithm chooses uniformly at random $\bar y$ in $N(y)$. The
circuit $C$ is invoked next to invert $\bar y$. The iteration succeeds if and only if $\bar y \in \mathrm{INV}$. For all $y \in V_w$,
$\frac{\|N(y) \cap \mathrm{INV}\|}{\|N(y)\|} \geq \frac{1}{w}$, and thus the probability that one iteration fails conditioned by $y \in V_w$ is $\leq (1 - (1/w))$.
Since the procedure does $n \cdot w$ iterations, the probability over $y \in \{0,1\}^n$ and over the random bits
used by the algorithm $D$, conditioned by $y \in V_w$, that $y$ is not list-inverted is $\leq (1 - (1/w))^{n \cdot w} < e^{-n}$.
Therefore the probability that $y$ is not list-inverted is bounded by the probability that $y \notin V_w$ plus the
above conditional probability of failure to list-invert. Thus, it is bounded by $\delta - e^{-n} + e^{-n} = \delta$.
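
The two numerical inequalities used in this argument can be sanity-checked for concrete values. The sketch below is only an illustration: the sample values of $n$, $\delta$, $\gamma$ and the helper name `amplification_parameters` are assumptions made for the example, and logarithms are taken base 2.

```python
import math

def amplification_parameters(delta: float, gamma: float):
    """Return (ell, w) as defined in the text, with logarithms base 2 (assumed)."""
    log_term = math.log2(2.0 / gamma)
    ell = math.ceil((3.0 / delta) * log_term)
    w = 6.0 * (1.0 / delta) * log_term * (1.0 / gamma)
    return ell, w

# Illustrative values only; they satisfy n > ln(2/delta) and delta < 1/3.
n, delta, gamma = 20, 0.2, 0.01
ell, w = amplification_parameters(delta, gamma)

# Inequality used to bound the size of the complement of V_w.
lhs = ell / w
rhs = gamma - (1.0 - delta + math.exp(-n)) ** ell

# Failure bound for n*w iterations, each failing with probability at most 1 - 1/w.
iterations = math.ceil(n * w)
fail_bound = (1.0 - 1.0 / w) ** iterations

print(ell, lhs < rhs, fail_bound < math.exp(-n))
```

For these sample values $\ell/w$ is close to $\gamma/2$, which is why the first inequality holds with room to spare.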

Note that the algorithm $D$ is using at each iteration the random strings $y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_\ell$ and
there are $n \cdot w$ iterations. There is a way to fix these random strings used by $D$ so that the circuit $B$
that is obtained from $D$ by using the fixed bits instead of random bits list-inverts a fraction of at least
$(1 - \delta)$ of the strings $x \in \{0,1\}^n$. There are $n \cdot w \cdot (\ell - 1)$ fixed strings.

Assuming that the circuit $C$ and $\ell$ are given, the permutation $R$ can be described, using the
previously-discussed procedure, from

• $2\delta \cdot N \cdot n$ bits that encode the $\delta N$ elements that $B$ fails to list-invert and the value of $R$ at these
points.

• The last $L$ positions in the circular permutation $R$. This requires $L \cdot n$ bits.

• For each of the $(1 - \delta)N$ strings $x$ that are list-inverted by $B$, the rank of $x$ in the generated LIST.
This requires $(1 - \delta) \cdot N \cdot (\log n + \log w + \log T)$ bits.

• The set of $n \cdot w \cdot (\ell - 1)$ fixed strings $y$ and the value of $R$ on $y, R(y), \ldots, R^{L-1}(y)$ for every fixed
$y$. This requires $n \cdot w \cdot (\ell - 1) \cdot n + n \cdot w \cdot (\ell - 1) \cdot L \cdot n < n^2 w \ell L$ bits (for $\ell - 1 < L$).

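The total number of bits needed for the description (given $B$) is bounded by

$$2\delta N n + L n + (1-\delta) N \log n + (1-\delta) N \log w + (1-\delta) N \log T + n^2 w \ell L.$$

Plugging the values of $\ell$ and $w$, we obtain that the description of $R$ is bounded by $2\delta N n + L n + N \log n +
(\log 6) N + N \log(1/\delta) + N \log\log(2/\gamma) + N \log(1/\gamma) + N \log T + 18 n^2 \cdot L \cdot \frac{1}{\gamma} \cdot \left(\frac{1}{\delta}\right)^2 \cdot \left(\log\frac{2}{\gamma}\right)^2$.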

We want to estimate the number of permutations that are $(\gamma, T)$-good for some $L$-restricted circuit
$C$. We state the result for a particular combination of parameters that will be of interest in our
application. The extractor construction will involve the parameters $m \in \mathbb{N}$ and $\epsilon > 0$. We will have
$\gamma = \epsilon/m$, $T = m^2 \cdot (1/\epsilon^2)$, and $L = m$.

Lemma 4.4 Let $n \in \mathbb{N}$, $m \in \mathbb{N}$, $\epsilon > 0$, $\delta > 0$. Let $N = 2^n$. Consider $\gamma = \epsilon/m$ and $T = m^2 \cdot (1/\epsilon^2)$. Let
$\ell = \lceil (3/\delta) \log(2/\gamma) \rceil$. Assume that $\delta = O(1)$ and $m^2 \cdot (1/\epsilon) = o(N/n^4)$. Let $C$ be an $m$-restricted circuit,
with inputs of length $\ell n$. Then the number of permutations $R$ in $\mathrm{PERM}_\ell$ that are $(\gamma, T)$-good for $C$ is
bounded by $2^h$, where $h = 3\delta \cdot N \cdot n + 3N \log m + 3N \log(1/\epsilon)$.

Proof Under the assumptions in the hypothesis, Lemma 4.3 implies that any permutation that is
$(\gamma, T)$-good for $C$ can be described with $h$ bits. The conclusion follows immediately. ∎

5 Analysis of the construction of pseudo-random generators from 
one-way permutations 

We recall the classic construction (Blum and Micali [BM84] and Yao [Yao82]) of a pseudo-random
generator from a one-way permutation. The construction starts with a one-way permutation
$R : \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$. In the classical setting, we work under the assumption that no circuit
of some bounded size inverts $R(x)$ except for a small fraction of $x$ in the domain of $R$.

Step 1. We consider the predicate $b : \{0,1\}^{\ell n} \times \{0,1\}^{\ell n} \to \{0,1\}$ defined by $b(x,r) = x \cdot r$ (the
inner product modulo 2). By the well-known Goldreich-Levin Theorem [GL89], $b(x,r)$ is a hard-core
predicate for $R(x) \odot r$, i.e., no circuit of an appropriate bounded size can calculate $b(x,r)$ from $R(x) \odot r$
except with a probability very close to 1/2. More precisely, it holds that if a probabilistic circuit $C_1$ on
input $R(x) \odot r$ calculates $b(x,r)$ with probability $1/2 + \epsilon$ (the probability is over $x$, $r$, and the random
bits used by $C_1$) then there is a circuit $C_2$ not much larger than $C_1$ which for a $3\epsilon/4$ fraction of $x$
list-inverts $x$. (In the classical setting this is in conflict with the above assumption, because one can
check the elements from the list one by one till $x$ is determined.) Lemma 5.3 proves this fact adapted
to an information-theoretic context (actually, in our setting, the fact holds with stronger parameters).

Step 2. The function $H_R : \{0,1\}^{2\ell n} \to \{0,1\}^{2\ell n+1}$, given by $H_R(x,r) = R(x) \odot r \odot b(x,r)$, can be
shown to be a pseudo-random generator with extension 1. More precisely, it holds that if $C_2$ is a circuit
that distinguishes $H_R(x,r)$ from $U_{2\ell n+1}$ with bias $\epsilon$, one can build a circuit $C_3$, not much larger than
$C_2$, that on input $R(x) \odot r$ calculates $b(x,r)$ correctly with probability at least $1/2 + \epsilon$. Lemma 5.2
proves this fact adapted to an information-theoretic context.

Step 3. We define $G_R(x, r)$ by the following algorithm.

Input: $R$ a permutation of $\{0,1\}^{\ell n}$, $x \in \{0,1\}^{\ell n}$, $r \in \{0,1\}^{\ell n}$.
For $i = 0$ to $m - 1$, $b_i = r \cdot (R^i(x))$.
Output $b_0 \odot b_1 \odot \cdots \odot b_{m-1}$.
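
The algorithm for $G_R$ can be sketched in a few lines. This is only an illustration under stated assumptions: bit strings are packed into Python integers, the permutation is given as a lookup table, and the toy permutation and the names `inner_product_mod2` and `bmy_generator` are invented for the example.

```python
def inner_product_mod2(a: int, b: int) -> int:
    """Inner product over GF(2) of two bit strings packed into integers."""
    return bin(a & b).count("1") % 2

def bmy_generator(perm, x: int, r: int, m: int):
    """Output bits b_i = r . R^i(x) for i = 0, ..., m-1, where perm[z] = R(z)."""
    bits = []
    z = x
    for _ in range(m):
        bits.append(inner_product_mod2(r, z))
        z = perm[z]  # advance to the next iterate R^{i+1}(x)
    return bits

# Toy permutation of {0,1}^3 (assumed for illustration): z -> 3z + 1 mod 8.
toy_perm = [(3 * z + 1) % 8 for z in range(8)]
print(bmy_generator(toy_perm, x=5, r=0b110, m=4))
```

Each output bit needs one more application of the permutation, so producing $m$ bits touches only the iterates $x, R(x), \ldots, R^{m-1}(x)$.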

It can be shown that under the given assumption, $G_R$ is a pseudo-random generator. More precisely,
it holds that if a circuit $C_4$ distinguishes $G_R$ from $U_m$ with bias $\epsilon$, then there is a circuit $C_3$, not much
larger than $C_4$, so that $C_3$ distinguishes $H_R(x,r)$ from $U_{2\ell n+1}$ with bias at least $\epsilon/m$. Lemma 5.1 proves
this fact adapted to an information-theoretic context.

We need to establish the properties of the above transformations (Steps 1, 2, and 3) in an
information-theoretic context because they will be used for the construction of an extractor. In our setting
$R : \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$ is a random permutation and $C_4$ is a statistical test. We will show that there are
some circuits $C_{1,1}, \ldots, C_{1,2^{m+1}-4}$ such that if $R$ does not hit $C_4$ $\epsilon$-correctly via $G$, then $R$ is $(\epsilon/m, m^2/\epsilon^2)$-good
for some $C_{1,i}$, and thus, by the results in the previous section, $R$ has a short description. In
our context, the size of the different circuits appearing in Steps 1, 2, and 3 will be considered to be
unbounded. What matters is the number and the pattern of queries, i.e., the fact that $C_3$, $C_2$ and
$C_1$ are restricted circuits. This is an information-theoretic feature. The following lemmas follow closely
the standard proofs, only that, in addition, they analyze the pattern of queries made by the circuits
involved.

Lemma 5.1 (Analysis of Step 3.) For any circuit $C_4$ there are $2^{m-1} - 1$ circuits $C_{3,1}, C_{3,2}, \ldots, C_{3,2^{m-1}-1}$
such that:

(1) If $R$ is a permutation with

$$\left|\mathrm{Prob}_{x,r}(G_R(x,r) \in C_4) - \mathrm{Prob}(U_m \in C_4)\right| > \epsilon,$$

(i.e., $R$ does not hit $C_4$ $\epsilon$-correctly via $G$), then there is $i \in \{1, \ldots, 2^{m-1} - 1\}$ such that

$$\left|\mathrm{Prob}(H_R(U_{\ell n}, U'_{\ell n}) \in C^R_{3,i}) - \mathrm{Prob}(U_{2\ell n+1} \in C^R_{3,i})\right| > \frac{\epsilon}{m}.$$

(2) All the circuits $C_{3,i}$ are $(m - 2)$-restricted.

Proof For $k \in \{0, \ldots, m - 1\}$, we define the distributions

$$d_k = U_k \odot (U_{\ell n} \cdot U'_{\ell n}) \odot (R(U_{\ell n}) \cdot U'_{\ell n}) \odot \cdots \odot (R^{m-k-1}(U_{\ell n}) \cdot U'_{\ell n}),$$

where $U_k$, $U_{\ell n}$, and $U'_{\ell n}$ are distinct instances of the uniform distributions on $\{0,1\}^k$, $\{0,1\}^{\ell n}$, and
$\{0,1\}^{\ell n}$, respectively. Suppose that a permutation $R$ satisfies

$$\mathrm{Prob}_{x,r}(G_R(x,r) \in C_4) - \mathrm{Prob}(U_m \in C_4) > \epsilon. \quad (2)$$

In the new notation, the above reads $\mathrm{Prob}(d_0 \in C_4) - \mathrm{Prob}(d_{m-1} \in C_4) > \epsilon$. This implies that there
is some $k \in \{0, \ldots, m - 2\}$ such that $\mathrm{Prob}(d_k \in C_4) - \mathrm{Prob}(d_{k+1} \in C_4) > \epsilon/m$. For $z_1 \in \{0,1\}^{\ell n}$,
$z_2 \in \{0,1\}^{\ell n}$, $z_3 \in \{0,1\}$, we define

$$f(z_1 \odot z_2 \odot z_3) = z_3 \odot (z_1 \cdot z_2) \odot (R(z_1) \cdot z_2) \odot \cdots \odot (R^{m-k-2}(z_1) \cdot z_2).$$

Note that

$$d_k = U_k \odot f(R(U_{\ell n}) \odot U'_{\ell n} \odot (U_{\ell n} \cdot U'_{\ell n}))$$

and

$$d_{k+1} = U_k \odot f(U_{\ell n} \odot U'_{\ell n} \odot U_1).$$

We define the following circuit $D$ that is able to distinguish $H_R(U_{\ell n}, U'_{\ell n})$ from $U_{2\ell n+1}$. The input of $D$ is
a string $y \in \{0,1\}^{2\ell n+1}$, which we break into $y = y_1 \odot y_2 \odot y_3$, with $y_1$ and $y_2$ in $\{0,1\}^{\ell n}$, and $y_3 \in \{0,1\}$.
The circuit $D$ on input $y \in \{0,1\}^{2\ell n+1}$ chooses a $k$-bits long string $u$, calculates $f(y_1, y_2, y_3)$ using the
oracle $R$ and simulates $C_4$ on input $u \odot f(y_1, y_2, y_3)$. Note that the calculation of $f(y_1, y_2, y_3)$ requires at
most the query of the strings $y_1, R(y_1), \ldots, R^{m-3}(y_1)$. Thus, $D$ is an $(m - 2)$-restricted circuit. Clearly,
$\mathrm{Prob}_{u,y}(y \in D^R) = \mathrm{Prob}(d_{k+1} \in C_4)$ and $\mathrm{Prob}_{u,x,r}(H_R(x,r) \in D^R) = \mathrm{Prob}(d_k \in C_4)$. Therefore,
$\mathrm{Prob}_{u,x,r}(H_R(x,r) \in D^R) - \mathrm{Prob}_{u,y}(y \in D^R) > \epsilon/m$. By fixing in all possible ways $k \in \{0, \ldots, m - 2\}$
and then the $k$-bits long string $u$, we obtain $2^{m-1} - 1$ circuits, denoted $C_{3,1}, \ldots, C_{3,2^{m-1}-1}$, that act like $D$
except that the random bits are replaced by the fixed bits. The argument above shows that if $R$ satisfies
Equation (2), then there is one circuit $C_{3,i}$ in the above set of circuits such that $\mathrm{Prob}_{x,r}(H_R(x,r) \in
C^R_{3,i}) - \mathrm{Prob}_y(y \in C^R_{3,i}) > \epsilon/m$. The circuits $C_{3,i}$ are $(m - 2)$-restricted circuits. With a similar proof,
one can see that if

$$\mathrm{Prob}(U_m \in C_4) - \mathrm{Prob}_{x,r}(G_R(x,r) \in C_4) > \epsilon, \quad (3)$$

then there is one circuit $C_{3,i}$ such that $\mathrm{Prob}_y(y \in C^R_{3,i}) - \mathrm{Prob}_{x,r}(H_R(x,r) \in C^R_{3,i}) > \epsilon/m$. ∎

Lemma 5.2 (Analysis of Step 2.) Let $C_3$ be an oracle circuit that is $L$-restricted, for some parameter
$L$. There are four oracle circuits $C_{2,1}, C_{2,2}, C_{2,3}, C_{2,4}$ such that

(1) If a permutation $R$ satisfies

$$\left|\mathrm{Prob}_{x,r}(H_R(x,r) \in C_3^R) - \mathrm{Prob}(U_{2\ell n+1} \in C_3^R)\right| > \epsilon,$$

then there is $i \in \{1,2,3,4\}$ such that

$$\mathrm{Prob}_{x,r}(C^R_{2,i}(R(x) \odot r) = b(x,r)) > \frac{1}{2} + \epsilon.$$

(2) The four circuits are $L$-restricted.

Proof We define the oracle circuit $B$ that on input $R(x) \odot r$ runs as follows. It chooses a random bit
$u$ and then it simulates the circuit $C_3^R$ to determine if $R(x) \odot r \odot u$ belongs to $C_3^R$ or not. If the answer
is YES, the output is $u$, and if the answer is NO, the output is $1 - u$. We also define the circuit $D$ in
a similar way, with the only change that the YES/NO branches are permuted. Note that $B$ and $D$ are
both circuits that are $L$-restricted.

Recall that $H_R(x,r) = R(x) \odot r \odot b(x,r)$. Let us suppose that for some permutation $R$,
$\mathrm{Prob}_{x,r}(R(x) \odot r \odot b(x,r) \in C_3^R) - \mathrm{Prob}(U_{2\ell n+1} \in C_3^R) > \epsilon$. Note that this difference can be written as

$$\left(\mathrm{Prob}_{x,r}(R(x) \odot r \odot b(x,r) \in C_3^R) - \mathrm{Prob}_{U_1,x,r}(R(x) \odot r \odot U_1 \in C_3^R)\right) + \left(\mathrm{Prob}_{U_1,x,r}(R(x) \odot r \odot U_1 \in C_3^R) - \mathrm{Prob}(U_{2\ell n+1} \in C_3^R)\right).$$

The second term is equal to zero, because $R$ is a permutation and, thus,
$R(x) \odot r \odot U_1$ is actually the uniform distribution on $\{0,1\}^{2\ell n+1}$. Therefore, $\mathrm{Prob}_{x,r}(R(x) \odot r \odot b(x,r) \in
C_3^R) - \mathrm{Prob}_{U_1,x,r}(R(x) \odot r \odot U_1 \in C_3^R) > \epsilon$. According to Yao's lemma that connects predictors to
distinguishers (for a proof see, for example, [Zim04, pp. 162]), it follows that $\mathrm{Prob}_{u,x,r}(B^R(R(x) \odot r) =
b(x,r)) > \frac{1}{2} + \epsilon$. Let $B_0$ ($B_1$) be the circuit that is obtained from $B$ by fixing bit $u$ to 0 (respectively, to
1). Then at least one of the events "$B_0^R(R(x) \odot r) = b(x,r)$" or "$B_1^R(R(x) \odot r) = b(x,r)$" has probability
$> \frac{1}{2} + \epsilon$.

If $\mathrm{Prob}_z(z \in C_3^R) - \mathrm{Prob}_{x,r}(R(x) \odot r \odot b(x,r) \in C_3^R) > \epsilon$, then the same argument works for the
circuit $D$, and we obtain two deterministic circuits $D_0$ and $D_1$. The four circuits $B_0$, $B_1$, $D_0$ and $D_1$
satisfy the requirements. ∎

Lemma 5.3 (Analysis of Step 1.) Let $C_2$ be an oracle circuit that is $L$-restricted for some parameter
$L$. Then there is a circuit $C_1$ such that

(1) If $R$ is a permutation such that $\mathrm{Prob}_{x,r}(C_2^R(R(x) \odot r) = b(x,r)) > \frac{1}{2} + \epsilon$, then for at least a fraction
$\epsilon$ of $x \in \{0,1\}^{\ell n}$, $C_1^R$ on input $y = R(x)$ outputs a list of $1/\epsilon^2$ strings that contains $x$ (i.e., $R$ is
$(\epsilon, 1/\epsilon^2)$-good for $C_1$).

(2) The circuit $C_1$ is $L$-restricted.

Proof Suppose permutation $R : \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$ satisfies $\mathrm{Prob}_{x,r}(C_2^R(R(x) \odot r) = b(x,r)) > (1/2) + \epsilon$.
Then, by a standard averaging argument, for a fraction $\epsilon$ of $x$ in $\{0,1\}^{\ell n}$, $\mathrm{Prob}_r(C_2^R(R(x) \odot r) =
b(x,r)) > (1/2) + (\epsilon/2)$. Consider such an $x$ and let $\mathrm{Had}(x)$ denote the encoding of $x$ via the Hadamard
error-correcting code (see [Tre04]). By the definition of the Hadamard code, $b(x,r)$ is just the $r$-th bit
of $\mathrm{Had}(x)$. Thus the string $u = C_2^R(R(x) \odot (0 \ldots 0)) \odot \cdots \odot C_2^R(R(x) \odot (1 \ldots 1))$ agrees with $\mathrm{Had}(x)$
on at least a fraction $(1/2) + (\epsilon/2)$ of positions. Since the circuit $C_2$ is $L$-restricted, the string $u$ can be
calculated by querying only $y, R(y), \ldots, R^{L-1}(y)$, where $y = R(x)$. By brute force we can determine
the list of all strings $z$ so that $\mathrm{Had}(z)$ agrees with $u$ on at least a fraction $\frac{1}{2} + \frac{\epsilon}{2}$ of positions. It is known (see, for
example, [Zim04, pp. 218]) that there are at most $\frac{1}{4} \cdot \left(\frac{2}{\epsilon}\right)^2 = \left(\frac{1}{\epsilon}\right)^2$ such strings $z$ and one of them is
$x$. ∎
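
The brute-force list-decoding step of this proof can be illustrated on very short strings. The sketch below is a toy under stated assumptions: strings are packed into integers, the function names are invented for the example, and the corrupted positions are chosen arbitrarily.

```python
def hadamard_encode(x: int, nbits: int):
    """Had(x): the list of inner products x . r mod 2 over all r in {0,1}^nbits."""
    return [bin(x & r).count("1") % 2 for r in range(2 ** nbits)]

def brute_force_list_decode(u, nbits: int, agreement: float):
    """All z whose codeword agrees with u on at least an `agreement` fraction of positions."""
    length = 2 ** nbits
    result = []
    for z in range(length):
        codeword = hadamard_encode(z, nbits)
        matches = sum(1 for a, b in zip(codeword, u) if a == b)
        if matches >= agreement * length:
            result.append(z)
    return result

# Toy example (assumed parameters): corrupt 3 of the 16 positions of Had(0b1011).
nbits, x = 4, 0b1011
u = hadamard_encode(x, nbits)
for pos in (1, 6, 12):
    u[pos] ^= 1
eps = 0.5  # agreement threshold 1/2 + eps/2 = 0.75
candidates = brute_force_list_decode(u, nbits, 0.5 + eps / 2)
print(candidates)
```

The search enumerates every candidate $z$, which is affordable here only because the circuits in this setting are of unbounded size and only their query pattern matters.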

By combining Lemma 5.1, Lemma 5.2, and Lemma 5.3, we obtain the following fact.

Lemma 5.4 Let $C_4$ be a circuit. Then there are $2^{m+1} - 4$ circuits $C_{1,1}, \ldots, C_{1,2^{m+1}-4}$ such that

(1) If $R$ is a permutation with

$$\left|\mathrm{Prob}_{x,r}(G_R(x,r) \in C_4) - \mathrm{Prob}(U_m \in C_4)\right| > \epsilon,$$

(i.e., $R$ does not hit $C_4$ $\epsilon$-correctly via $G$), then there is some circuit $C_{1,i}$ such that for at least a
fraction $\frac{\epsilon}{m}$ of $x$, $C^R_{1,i}$ on input $R(x)$ outputs a list of $m^2 \cdot \left(\frac{1}{\epsilon}\right)^2$ strings that contains $x$ (i.e., $R$ is
$(\epsilon/m, m^2/\epsilon^2)$-good for $C_{1,i}$).

(2) All the circuits $C_{1,i}$ are $(m - 2)$-restricted.

6 An extractor from a crypto pseudo-random generator 

We first build a special type of extractor in which the weakly-random string is the truth-table of a 
permutation in PERM. 

The following parameters will be used throughout this section. Let $\epsilon > 0$, $\delta > 0$, and $n, m \in \mathbb{N}$ be
parameters. Let $N = 2^n$. Let $\ell = \lceil (3/\delta) \log(2m \cdot (1/\epsilon)) \rceil$. We consider the set of permutations $\mathrm{PERM}_\ell$.
We assume that $\delta = O(1)$ and $m^2 \cdot (1/\epsilon) = o(N/n^4)$.

Let $G : \mathrm{PERM}_\ell \times (\{0,1\}^{\ell n} \times \{0,1\}^{\ell n}) \to \{0,1\}^m$ be the function defined by the following algorithm
(the same as the algorithm for $G_R$ from the previous section).



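Parameters: $\ell \in \mathbb{N}$, $m \in \mathbb{N}$.
Input: $R \in \mathrm{PERM}_\ell$, $(x, r) \in \{0,1\}^{\ell n} \times \{0,1\}^{\ell n}$.
For $i = 0$ to $m - 1$, $b_i = r \cdot R^i(x)$.
Output $b_0 \odot b_1 \odot \cdots \odot b_{m-1}$.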

The following lemma, in view of Lemma 2.2, shows that $G$ is an extractor for the special case of
weakly-random strings that are truth-tables of permutations in $\mathrm{PERM}_\ell$.

Lemma 6.1 Let $C_4$ be a test for strings of length $m$ (i.e., $C_4 \subseteq \{0,1\}^m$). Let $\mathrm{GOOD}(C_4) = \{R \in
\mathrm{PERM}_\ell \mid R$ does not hit $C_4$ $\epsilon$-correctly via $G\}$. Then $\|\mathrm{GOOD}(C_4)\| < 2^{m+h+1}$, where $h = 3\delta N n +
3N \log m + 3N \log(1/\epsilon)$.



Proof Let $C_{1,1}, \ldots, C_{1,2^{m+1}-4}$ be the $2^{m+1} - 4$ circuits implied by Lemma 5.4 to exist (corresponding
to the test $C_4$). Let $R$ be in $\mathrm{GOOD}(C_4)$. Then Lemma 5.4 shows that there is a circuit $C_{1,i}$ from
the above list having the following property: For at least a fraction $\gamma = \epsilon/m$ of strings $x \in \{0,1\}^{\ell n}$,
$C^R_{1,i}$ on input $R(x)$ returns a list having $T = m^2 \cdot (1/\epsilon^2)$ strings, one of which is $x$. Thus, $R$ is $(\gamma, T)$-good
for $C_{1,i}$ (recall Definition 4.2). It follows that the set of permutations $R \in \mathrm{PERM}_\ell$ that do not
hit $C_4$ $\epsilon$-correctly via $G$ is included in $\bigcup_{i=1}^{2^{m+1}-4} \{R \in \mathrm{PERM}_\ell \mid R$ is $(\gamma, T)$-good for $C_{1,i}\}$. Lemma 4.4
shows that, for each $i \in \{1, \ldots, 2^{m+1} - 4\}$, $\|\{R \in \mathrm{PERM}_\ell \mid R$ is $(\gamma, T)$-good for $C_{1,i}\}\| \leq 2^h$, where
$h = 3\delta \cdot N \cdot n + 3N \log m + 3N \log(1/\epsilon)$. The conclusion follows. ∎



In order to obtain a standard extractor (rather than the special type given by Lemma 6.1), the only
thing that remains to be done is to transform a random binary string $X$ into a permutation $R \in \mathrm{CIRC}$,
which determines $\bar R \in \mathrm{PERM}_\ell$ that is used in the function $G$ given above.

Note that a permutation $R \in \mathrm{CIRC}$ is specified by $(R(1), R^2(1), \ldots, R^{N-1}(1))$, which is an
arbitrary permutation of the set $\{2, 3, \ldots, N\}$. Consequently, we need to generate permutations of the set
$\{1, 2, \ldots, N - 1\}$ (which can be viewed as permutations of $\{2, 3, \ldots, N\}$ in the obvious way). We can
use the standard procedure that transforms a function mapping $[N - 1]$ to $[N - 1]$ into a permutation
of the same type. To avoid some minor truncation nuisances, we actually use a function $X : [N] \to [N]$.



Input: $X : [N] \to [N]$.
Loop 1:
    for $i = 1$ to $N - 1$, $R(i) = i$. (Initially $R$ is the identity permutation.)
Loop 2:
    for $i = 1$ to $N - 1$
        $Y(i) = 1 + (X(i) \bmod i)$
Loop 3:
    for $i = 1$ to $N - 1$
        Swap $R(i)$ with $R(Y(i))$.
Output: permutation $R : [N - 1] \to [N - 1]$.
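
The three loops translate directly into code. The sketch below is an illustration only: the representation of $X$ as a Python list with `X[i-1]` holding $X(i)$, the dictionary output, and the sample function are assumptions made for the example.

```python
def function_to_permutation(X):
    """Turn a function X: [N] -> [N] into a permutation of [N-1].

    X is a list of length N whose entry X[i-1] is X(i); values lie in 1..N.
    Returns a dict R mapping each i in 1..N-1 to R(i).
    """
    N = len(X)
    # Loop 1: start from the identity permutation on {1, ..., N-1}.
    R = {i: i for i in range(1, N)}
    # Loop 2: derive the swap targets Y(i) in {1, ..., i}.
    Y = {i: 1 + (X[i - 1] % i) for i in range(1, N)}
    # Loop 3: perform the swaps in increasing order of i.
    for i in range(1, N):
        R[i], R[Y[i]] = R[Y[i]], R[i]
    return R

X = [3, 1, 4, 1, 5, 2]  # illustrative function on [6]
R = function_to_permutation(X)
print(R)
```

Note that the value $X(N)$ is never read by the loops, so several functions $X$ lead to the same permutation.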



We want to estimate the number of functions $X : [N] \to [N]$ that map via the above procedure to
a given permutation $R : [N - 1] \to [N - 1]$. We call a sequence $(Y(1), \ldots, Y(N - 1))$ a *-sequence if, for
all $i$, $Y(i) \in \{1, \ldots, i\}$. Observe that, using Loop 3, a *-sequence $(Y(1), \ldots, Y(N - 1))$ defines a unique
permutation $(R(1), \ldots, R(N - 1))$, and thus it is enough to estimate the maximum number of functions
$X : [N] \to [N]$ that map via Loop 2 in the above procedure to a given *-sequence $(Y(1), \ldots, Y(N - 1))$
(the maximum is taken over all *-sequences of length $N - 1$). We denote this number by $A(N)$. A
(rough) upper bound can be established as follows.



A{N) < 



< 





-N- 




\ ^ 1 




~2 




A^- 1 



N 



{N + l){N + 2) ■.. 



N 



.N-1 
{2N - 1) 



+ 1 



1 • 2 • 

2N - 1 
N-1 



(AT-l) 



< 2 



2N 
We can now present the (standard) extractor. We choose the parameters as follows. Fix $n \in \mathbb{N}$
and let $N = 2^n$ and $\bar N = n \cdot 2^n$. Let $\lambda \in (0,1)$ be a constant. Let $\alpha > 0$, $\beta > 0$ be constants
such that $\alpha < \lambda/3$, $\beta < (\lambda - 3\alpha)/4$. Let $\epsilon > \bar N^{-\beta}$ and $m < \bar N^{\alpha}$. Take $\delta = (\lambda - 4\beta - 3\alpha)/4$ and
$\ell = \lceil (3/\delta) \log(2m \cdot (1/\epsilon)) \rceil$. The weakly-random string $X$ has length $\bar N$ and is viewed as the
truth-table of a function mapping $[N]$ to $[N]$. The seed is of the form $y = (x, r) \in \{0,1\}^{\ell n} \times \{0,1\}^{\ell n}$. We
first transform $X$ into a permutation $R_X \in \mathrm{PERM}_\ell$ using the above algorithm and then taking the
$\ell$-direct product. We define the extractor $E : \{0,1\}^{\bar N} \times \{0,1\}^{2\ell n} \to \{0,1\}^m$ by $E(X, (x, r)) = G(R_X, (x, r))$.
More explicitly, the extractor is defined by the following procedure.



Parameters: $n \in \mathbb{N}$, $\bar N \in \mathbb{N}$, $\lambda > 0$, $\epsilon > 0$, $\ell \in \mathbb{N}$, $m \in \mathbb{N}$, satisfying the above requirements.

Inputs: The weakly-random string $X \in \{0,1\}^{\bar N}$, viewed as the truth-table of a function
$X : [N] \to [N]$; the seed $y = (x, r) \in \{0,1\}^{\ell n} \times \{0,1\}^{\ell n}$.

Step 1. Transform $X$ into a permutation $R_X \in \mathrm{PERM}_\ell$. The transformation is performed by the
above procedure which yields a permutation $R \in \mathrm{CIRC}$, and, next, $R_X$ is the $\ell$-direct product of $R$.

Step 2. For $i = 0$ to $m - 1$, $b_i = r \cdot R_X^i(x)$.

Output $b_0 \odot b_1 \odot \cdots \odot b_{m-1}$, which is denoted $E(X, y)$.
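
The whole procedure can be sketched end to end on a toy instance. Several details below are assumptions made for the illustration and are not fixed by the text: the "obvious way" of viewing a permutation of $\{1, \ldots, N-1\}$ as one of $\{2, \ldots, N\}$ is taken to be "add 1", an element $v \in [N]$ is encoded as the $n$-bit binary representation of $v - 1$, and the function names are invented.

```python
def circular_permutation_from_function(X):
    """Build a one-cycle permutation R of {1..N} from X: [N] -> [N] (list, values in 1..N)."""
    N = len(X)
    perm = list(range(N))            # perm[i] holds the entry for i in 1..N-1; index 0 unused
    for i in range(1, N):
        y = 1 + (X[i - 1] % i)
        perm[i], perm[y] = perm[y], perm[i]
    # Read perm as the orbit (R(1), R^2(1), ..., R^{N-1}(1)), shifted to {2..N}.
    orbit = [1] + [perm[i] + 1 for i in range(1, N)]
    return {orbit[j]: orbit[(j + 1) % N] for j in range(N)}

def extractor(X, x_blocks, r_bits, m, n):
    """Output b_0..b_{m-1}, where b_i is the GF(2) inner product of r with R_X^i(x)."""
    R = circular_permutation_from_function(X)
    out, blocks = [], list(x_blocks)
    for _ in range(m):
        # Encode each block v in {1..N} as the n-bit string of v - 1 and concatenate.
        bits = [((v - 1) >> (n - 1 - k)) & 1 for v in blocks for k in range(n)]
        out.append(sum(a & b for a, b in zip(r_bits, bits)) % 2)
        blocks = [R[v] for v in blocks]   # apply the direct product of R blockwise
    return out

# Toy instance: n = 3, N = 8, two blocks, seed r of length 6, output length m = 3.
X = [3, 1, 4, 1, 5, 2, 6, 5]
print(extractor(X, x_blocks=(1, 4), r_bits=[1, 0, 1, 0, 1, 1], m=3, n=3))
```

In this sketch the full permutation is built before any output bit is produced, which mirrors Step 1 of the procedure.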



We have defined a function $E : \{0,1\}^{\bar N} \times \{0,1\}^{2\ell n} \to \{0,1\}^m$. Note that the seed length $2\ell n$ is
$O(\log^2 \bar N)$ and the output length $m$ is $\bar N^{\alpha}$, for an arbitrary $\alpha < \lambda/3$.

Theorem 6.2 The function $E$ is a $(\lambda \bar N, 2\epsilon)$-extractor.

Proof Let $C_4$ be a subset of $\{0,1\}^m$. Taking into account Lemma 2.2, it is enough to show that
the number of strings $X \in \{0,1\}^{\bar N}$ that do not hit $C_4$ $\epsilon$-correctly via $E$ is at most $2^{\lambda \bar N - \log(1/\epsilon)}$. Let
$X \in \{0,1\}^{\bar N}$ be a string that does not hit $C_4$ $\epsilon$-correctly via $E$. By the definition of $E$, it follows that $R_X$
does not hit $C_4$ $\epsilon$-correctly via $G$. By Lemma 6.1, there are at most $2^{m+h+1}$ permutations $R \in \mathrm{PERM}_\ell$
that do not hit $C_4$ $\epsilon$-correctly via $G$, where $h = 3\delta N n + 3N \log m + 3N \log(1/\epsilon)$. Since the number of
functions $X : [N] \to [N]$ that map into a given permutation $R \in \mathrm{PERM}_\ell$ is at most $A(N) < 2^{2N}$, it
follows that

$$\|\{X \in \{0,1\}^{\bar N} \mid X \text{ does not hit } C_4 \ \epsilon\text{-correctly}\}\| \leq 2^{2N} \cdot 2^{m+h+1} < 2^{\lambda \bar N - \log(1/\epsilon)},$$

where the last inequality follows from the choice of parameters. ∎

7 A bitwise locally-computable extractor 

We present a bitwise locally-computable extractor: Each bit of the output string can be calculated
separately in $O(\log^2 \bar N)$, where $\bar N$ is the length of the weakly-random string. The proof uses the same
plan as for the extractor in Section 6, except that the weakly-random string $X$ is viewed as the
truth-table of an arbitrary function (not necessarily a permutation) and the "consecutive" values that are
used in the extractor are $X(x), X(x + 1), \ldots, X(x + m - 1)$ (instead of $R(x), R^2(x), \ldots, R^m(x)$ used
in Section 6).

The parameter $n \in \mathbb{N}$ will be considered fixed throughout this section. We denote $N = 2^n$ and
$\bar N = n \cdot N$. The parameter $m \in \mathbb{N}$ will be specified later (it will be a subunitary power of $\bar N$). For
two binary strings $x$ and $r$ of the same length, $b(x, r)$ denotes the inner product of $x$ and $r$ viewed as
vectors over the field GF(2).

The weakly-random string $X$ has length $\bar N$, and is viewed as the truth-table of a function $X :
\{0,1\}^n \to \{0,1\}^n$. For some $\ell \in \mathbb{N}$ that will be specified later we define $\bar X : \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$ by
$\bar X(x_1 \odot \cdots \odot x_\ell) = X(x_1) \odot \cdots \odot X(x_\ell)$, i.e., $\bar X$ is the $\ell$-direct product of $X$. We also denote $\bar x = x_1 \odot \cdots \odot x_\ell$.
The seed of the extractor will be $(\bar x, r) \in \{0,1\}^{\ell n} \times \{0,1\}^{\ell n}$. We define $\bar x + 1 = (x_1 + 1) \odot \cdots \odot (x_\ell + 1)$
(where the addition is done modulo $2^n$) and inductively, for any $k \in \mathbb{N}$, $\bar x + k + 1 = (\bar x + k) + 1$. The
extractor is defined by

$$E(X, (\bar x, r)) = b(\bar X(\bar x), r) \odot b(\bar X(\bar x + 1), r) \odot \cdots \odot b(\bar X(\bar x + m - 1), r). \quad (4)$$
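
Equation (4) makes the locality explicit: output bit $j$ depends only on the table entries $X(x_1 + j), \ldots, X(x_\ell + j)$. A minimal sketch follows, assuming that $X$ is stored as a list of integers in $\{0, \ldots, 2^n - 1\}$, that each value is written most-significant bit first, and that the function names are illustrative.

```python
def local_extractor_bit(X, x_blocks, r_bits, j, n):
    """Bit j of E(X, (x, r)): GF(2) inner product of r with the direct product at x + j."""
    N = 1 << n
    bits = []
    for xb in x_blocks:
        value = X[(xb + j) % N]          # one table lookup per block; addition is modulo 2^n
        bits.extend((value >> (n - 1 - k)) & 1 for k in range(n))
    return sum(a & b for a, b in zip(r_bits, bits)) % 2

def local_extractor(X, x_blocks, r_bits, m, n):
    """All m output bits, each computed independently of the others."""
    return [local_extractor_bit(X, x_blocks, r_bits, j, n) for j in range(m)]

# Toy instance: n = 2, N = 4, two blocks, seed r of length 4, output length m = 3.
X = [2, 0, 3, 1]
print(local_extractor(X, x_blocks=(1, 3), r_bits=[1, 1, 0, 1], m=3, n=2))
```

Computing a single bit reads $\ell$ entries of $n$ bits each, independently of the output length $m$.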

A set $D \subseteq \{0,1\}^m$ is called a test. We say that $X$ hits a test $D$ $\epsilon$-correctly via $E$ if $\left|\mathrm{Prob}_{\bar x,r}(E(X, (\bar x, r)) \in
D) - \frac{\|D\|}{\|\{0,1\}^m\|}\right| \leq \epsilon$. We want to show that the number of functions $X$ that do not hit $D$ $\epsilon$-correctly via
$E$ is small and then use Lemma 2.2. To this aim we investigate the properties of a function $X$ that does
not hit a test $D \subseteq \{0,1\}^m$ $\epsilon$-correctly via $E$.

Lemma 7.1 Let $D \subseteq \{0,1\}^m$ be a fixed set. Then there are $2^{m+2} - 4$ circuits $C_1, \ldots, C_{2^{m+2}-4}$ such that
if $X$ does not hit $D$ $\epsilon$-correctly, then there is some circuit $C_i$, $i \in \{1, \ldots, 2^{m+2} - 4\}$, such that

$$\mathrm{Prob}_{\bar x,r}(C_i \text{ on input } b(\bar X(\bar x - m + 1), r) \odot \cdots \odot b(\bar X(\bar x - 1), r) \text{ outputs } b(\bar X(\bar x), r)) > 1/2 + \epsilon/m.$$

Proof The proof is similar to the proofs of Lemmas 5.1 and 5.2. Let $X : \{0,1\}^n \to \{0,1\}^n$ be a function
that does not hit $D$ $\epsilon$-correctly via $E$. This means that $\left|\mathrm{Prob}_{\bar x,r}(E(X, (\bar x, r)) \in D) - \mathrm{Prob}(U_m \in D)\right| > \epsilon$.
Let us first suppose that

$$\mathrm{Prob}_{\bar x,r}(E(X, (\bar x, r)) \in D) - \mathrm{Prob}(U_m \in D) > \epsilon. \quad (5)$$

For each $k \in \{0, \ldots, m - 1\}$, we define the hybrid distribution $d_k$ given by

$$d_k = b(\bar X(\bar x), r) \odot b(\bar X(\bar x + 1), r) \odot \cdots \odot b(\bar X(\bar x + m - k - 1), r) \odot U_k,$$

and

$$d_m = U_m.$$

Equation (5) states that $\mathrm{Prob}(d_0 \in D) - \mathrm{Prob}(d_m \in D) > \epsilon$. Using the standard argument, it follows
that there exists $k \in \{0, \ldots, m - 1\}$ such that $\mathrm{Prob}(d_k \in D) - \mathrm{Prob}(d_{k+1} \in D) > \epsilon/m$. We build a
probabilistic circuit $C$ that on input $b(\bar X(\bar x), r) \odot b(\bar X(\bar x + 1), r) \odot \cdots \odot b(\bar X(\bar x + m - k - 2), r)$ attempts
to calculate $b(\bar X(\bar x + m - k - 1), r)$.



Circuit $C$.

Input: $v_0 \odot v_1 \odot \cdots \odot v_{m-k-2}$, each $v_i \in \{0,1\}$. (In case $k = m - 1$, there is no input.)

Choose randomly $u \in \{0,1\}$, $t \in \{0,1\}^k$.

If $v_0 \odot \cdots \odot v_{m-k-2} \odot u \odot t \in D$, return $u$.
Else return $1 - u$.


By Yao's lemma on predictors versus distinguishers, it holds that

$$\mathrm{Prob}_{\bar x,r}(C \text{ on input } b(\bar X(\bar x), r) \odot \cdots \odot b(\bar X(\bar x + m - k - 2), r) \text{ outputs } b(\bar X(\bar x + m - k - 1), r)) > 1/2 + \epsilon/m,$$

where the probability is taken over $\bar x \in \{0,1\}^{\ell n}$, $r \in \{0,1\}^{\ell n}$ and the random bits used by $C$. The
procedure $C$ uses $k + 1$ random bits (for $u$ and $t$), and $k \in \{0, \ldots, m - 1\}$. By considering all possibilities
for $k$ and by fixing the $(k + 1)$ bits in all possible ways, we obtain $2^1 + \ldots + 2^m = 2^{m+1} - 2$ circuits
$C_1, \ldots, C_{2^{m+1}-2}$ with the desired property.
In the alternative (to Equation (5)) case

$$\mathrm{Prob}(U_m \in D) - \mathrm{Prob}(E(X, (\bar x, r)) \in D) > \epsilon,$$

we obtain in a similar way another set of $2^{m+1} - 2$ circuits. ∎

The next lemma is an analogue of Lemma 5.3. It states that if there exists a circuit $C$ with the
property indicated in Lemma 7.1, then, from $\bar X(\bar x - m + 1), \ldots, \bar X(\bar x - 1)$, one can compute, in a weak
but non-trivial way, $\bar X(\bar x)$.

Lemma 7.2 Let $C$ be a circuit. Then there is a circuit $B$ such that the following holds. Suppose
$\bar X : \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$ is a function such that

$$\mathrm{Prob}_{\bar x,r}(C(b(\bar X(\bar x - m + 1), r) \odot \cdots \odot b(\bar X(\bar x - 1), r)) = b(\bar X(\bar x), r)) > (1/2) + (\epsilon/m).$$

Then, for at least a fraction $\epsilon/m$ of $\bar x$ in $\{0,1\}^{\ell n}$, $B$ on input $\bar X(\bar x - m + 1) \odot \cdots \odot \bar X(\bar x - 1)$ outputs a
list of $m^2 \cdot (1/\epsilon^2)$ elements, one of which is $\bar X(\bar x)$.

Proof The proof is similar to the proof of Lemma 5.3. Let $C$ and $\bar X$ be as in the hypothesis. By an
averaging argument, it follows that for a fraction $(\epsilon/m)$ of $\bar x$ in $\{0,1\}^{\ell n}$,

$$\mathrm{Prob}_r(C(b(\bar X(\bar x - m + 1), r) \odot \cdots \odot b(\bar X(\bar x - 1), r)) = b(\bar X(\bar x), r)) > (1/2) + \epsilon/(2m). \quad (6)$$

Let $\mathrm{Had}(x)$ denote the encoding of a string $x$ via the Hadamard error-correcting code. By the definition
of the Hadamard code, $b(x, r)$ is just the $r$-th bit of $\mathrm{Had}(x)$. Consider the binary string $u(\bar x) \in \{0,1\}^{2^{\ell n}}$
whose $r$-th bit is $C(b(\bar X(\bar x - m + 1), r) \odot \cdots \odot b(\bar X(\bar x - 1), r))$ (here $r \in \{0, \ldots, 2^{\ell n} - 1\}$ is written in
base 2 on $\ell n$ bits for the sake of the definition of $b$). Clearly, the string $u(\bar x)$ can be calculated from
$\bar X(\bar x - m + 1), \ldots, \bar X(\bar x - 1)$. Equation (6) implies that, for a fraction $(\epsilon/m)$ of $\bar x \in \{0,1\}^{\ell n}$, $u(\bar x)$
agrees with $\mathrm{Had}(\bar X(\bar x))$ on at least a fraction $1/2 + \epsilon/(2m)$ of positions. By brute force, we can determine all the
strings $z$ so that $\mathrm{Had}(z)$ agrees with $u(\bar x)$ on at least a fraction $\frac{1}{2} + \epsilon/(2m)$ of positions. It is known that there are
at most $\frac{1}{4} \cdot \left(\frac{2m}{\epsilon}\right)^2 = \left(\frac{m}{\epsilon}\right)^2$ such strings $z$ and, by the above discussion, one of them is $\bar X(\bar x)$. ∎

The key property of the circuit $B$ in the above lemma is captured in the following definition (which
is analogous to Definition 4.2).

Definition 7.3 Let $B$ be a circuit. A function $\bar X : \{0,1\}^{\ell n} \to \{0,1\}^{\ell n}$ is $(\gamma, T)$-good for $B$ if for at
least a $\gamma$ fraction of $\bar x \in \{0,1\}^{\ell n}$, $B$ on input $\bar X(\bar x - m + 1) \odot \cdots \odot \bar X(\bar x - 1)$ outputs a $T$-list of strings,
one of which is $\bar X(\bar x)$.

We choose the parameters in the same way as in Section 4. The parameters $\epsilon$ and $m$ will be specified
later. We take $\delta > 0$, $\gamma = \epsilon/m$, $T = m^2/\epsilon^2$, $\ell = \lceil (3/\delta) \log(2/\gamma) \rceil$ and $w = \lceil 6 \cdot (1/\delta) \cdot \log(2/\gamma) \cdot (1/\gamma) \rceil$.

The next two lemmas are the analogues of Lemma 4.3. The first lemma shows the amplification
effect obtained by taking the $\ell$-direct product.

Lemma 7.4 The parameters are as specified above. Let $B$ be a circuit. Then there is an oracle circuit
$A$ such that:

(1) If $\bar X$ is $(\gamma, T)$-good for $B$, then, for a fraction $(1 - \delta)$ of $x$ in $\{0,1\}^n$, the circuit $A$, on input $x$
and $X(x - m + 1) \odot \cdots \odot X(x - 1)$ and with access to oracle $X$ restricted as shown in (2), outputs a
list containing $n \cdot w \cdot T$ elements, one of which is $X(x)$.

(2) The oracle circuit $A$ queries a set of $n \cdot w \cdot (\ell - 1) \cdot (m - 1)$ strings that do not depend on the
input.

Proof The proof is very similar to the first part of the proof of Lemma 4.3. Let GOOD be the set of
strings $\bar x$ in $\{0,1\}^{\ell n}$ such that the circuit $B$, on input $\bar X(\bar x - m + 1) \odot \cdots \odot \bar X(\bar x - 1)$, calculates a $T$-list
that contains $\bar X(\bar x)$. By hypothesis, $\|\mathrm{GOOD}\| \geq \gamma \cdot 2^{\ell n}$. We consider the following algorithm $A'$ that
can query the oracle $X : \{0,1\}^n \to \{0,1\}^n$ in several random positions.

Input: $x \in \{0,1\}^n$, and $(X(x - m + 1), \ldots, X(x - 1)) \in (\{0,1\}^n)^{m-1}$. The algorithm can pose random
queries to the oracle $X : \{0,1\}^n \to \{0,1\}^n$. The goal is to calculate a list of strings that contains $X(x)$.
LIST = $\emptyset$.
Repeat the following $n \cdot w$ times.
    Pick random $i \in \{1, \ldots, \ell\}$.
    Pick $\ell - 1$ random strings in $\{0,1\}^n$ denoted $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_\ell$.
    By querying the oracle $X$, find, for each $x_j$, the strings $X(x_j - m + 1), X(x_j - m + 2), \ldots, X(x_j - 1)$.
    Let $\bar x = (x_1, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_\ell)$. Build the string $\bar X(\bar x - m + 1) \odot \bar X(\bar x - m + 2) \odot \cdots \odot \bar X(\bar x - 1)$.
    Run the circuit $B$ on input $\bar X(\bar x - m + 1) \odot \bar X(\bar x - m + 2) \odot \cdots \odot \bar X(\bar x - 1)$.
    The circuit $B$ returns a $T$-list of $\ell$-tuples in $(\{0,1\}^n)^\ell$.
    (Note: In case of success, one of these $\ell$-tuples is
    $\bar X(\bar x) = (X(x_1), \ldots, X(x_{i-1}), X(x), X(x_{i+1}), \ldots, X(x_\ell))$.)
    Add to LIST the $i$-th component of every $\ell$-tuple in the list produced by $B$.
End Repeat



We say that the above algorithm is successful on input $x$ if, at the conclusion of the algorithm,
LIST contains $X(x)$. We estimate the success probability of the above circuit on input $x$. Let $N(x)$
be the multiset of $\ell$-tuples having $x$ as one component where the multiplicity of a tuple is the number
of occurrences of $x$ in the tuple. On input $x$, at each iteration, the algorithm chooses uniformly at
random $\bar x$ in $N(x)$. The algorithm succeeds at that iteration if and only if $\bar x \in \mathrm{GOOD}$. By following
the same arguments and the same calculations as in Lemma 4.3, we conclude that the probability
that algorithm $A'$ succeeds on $x$ is at least $(1 - \delta)$, where the probability is taken over $x$ and the
random strings used by $A'$. Note that the algorithm $A'$ is using at each iteration the random strings
$x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_\ell$, and there are $n \cdot w$ iterations. For each such random string $x_j$, $A'$ needs the
$(m - 1)$ values $X(x_j - m + 1), X(x_j - m + 2), \ldots, X(x_j - 1)$. There is a way to fix the above random
strings so that the circuit $A$, which results from $A'$ by using the fixed strings instead of the random
strings, succeeds on at least a $(1 - \delta)$ fraction of the strings $x \in \{0,1\}^n$. Therefore, the circuit $A$ has
the desired properties. ∎

Lemma 7.5 The parameters are as specified above. Let $A$ be an oracle circuit and $X : \{0,1\}^n \to \{0,1\}^n$
be a function such that $A$ and $X$ satisfy the conditions (1) and (2) in Lemma 7.4. More precisely, we
assume that:

(1) For a fraction $(1 - \delta)$ of $x$ in $\{0,1\}^n$, the circuit $A$, on input $x$ and $X(x - m + 1) \odot \cdots \odot X(x - 1)$
and with access to oracle $X$ restricted as shown in (2), outputs a list containing $n \cdot w \cdot T$ elements, one
of which is $X(x)$.

(2) The oracle circuit $A$ queries a set of $n \cdot w \cdot (\ell - 1) \cdot (m - 1)$ strings that do not depend on the
input.

Then, given $A$, $X$ can be described using a number of bits bounded by $2\delta N n + mn + N \log n +
(\log 6) N + N \log(1/\delta) + N \log\log(2/\gamma) + N \log(1/\gamma) + N \log T + 36 n^2 \cdot m \cdot \frac{1}{\gamma} \cdot \left(\frac{1}{\delta}\right)^2 \left(\log\frac{2}{\gamma}\right)^2$.

Proof The oracle circuit A allows a short description of the strings -'^(x) for the fraction of (1 — 5) of 
the strings x G {0,1}" given in assumption (1). Namely, such a string ^(x) is completely determined 
by the circuit A, by the value of X for the fixed set of queries given in assumption (2), by the previous 
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m — 1 values X{x — m + 1), . . . , X{x — 1), and by the rank of X{x) in the hst returned by A on input 
x,X{x - m + 1), . . .,X{x - 1). Thus, the truth-table of the function X : {0, 1}" {0, 1}" can be 
described (given the circuit A) using the following information. 

• 26 ■ N ■ n bits that encode the set of 5N elements on which A fails and the value of X at these 
points. 

• The "first" m — 1 values ^(0), . . . ,X{m — 1). This information requires (m — 1) • n bits. (Here, 
X{i) represents the value of X at the i-th string in {0, 1}", lexicographically ordered.) 

• For each of the (1 — 5)N strings y on which A succeeds, the rank of X(x) in the list returned by 
A. This requires {1 — 5) • N • (logn + logw + logT) bits. 

• The set n ■ w ■ {i — 1) ■ {m — 1) fixed strings that are queried by A and the value of X at these 
strings. This information requires 2n? ■ w ■ {£ — 1) ■ (m — 1) bits. 

The total number of bits needed for the description of X (given A) is bounded by 

$2\delta Nn + m \cdot n + (1-\delta)N\log n + (1-\delta)N\log w + (1-\delta)N\log T + 2n^2 \cdot w \cdot (\ell-1) \cdot (m-1).$

Taking into account that $\ell = \lceil (3/\delta)\log(2/\gamma) \rceil$ and
$w = \lceil 6 \cdot (1/\delta) \cdot \log(2/\gamma) \cdot (1/\gamma) \rceil$, the conclusion
follows. $\blacksquare$
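
The substitution in this last step can be made explicit. The following is a sketch and not part of the original argument: for readability it treats $w$ as if its ceiling were absent (an assumption), and it uses $1-\delta \le 1$, $m-1 \le m$ and $\ell-1 \le (3/\delta)\log(2/\gamma)$.

```latex
\begin{align*}
\log w &= \log 6 + \log(1/\delta) + \log\log(2/\gamma) + \log(1/\gamma), \\
2n^2 \cdot w \cdot (\ell-1) \cdot (m-1)
  &\le 2n^2 \cdot m \cdot \frac{6}{\delta\gamma}\log\frac{2}{\gamma}
       \cdot \frac{3}{\delta}\log\frac{2}{\gamma}
   = 36\, n^2 \cdot m \cdot \frac{1}{\gamma}
       \cdot \Bigl(\frac{1}{\delta}\Bigr)^{2}
       \Bigl(\log\frac{2}{\gamma}\Bigr)^{2}.
\end{align*}
```

Multiplying the first line by $N$ gives the four terms $(\log 6)N$, $N\log(1/\delta)$, $N\log\log(2/\gamma)$ and $N\log(1/\gamma)$ that appear in the statement of the lemma, and the second line gives its last term.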

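We make the final choice of parameters. Let $n \in \mathbb{N}$ and let $\lambda \in (0,1)$ be a
constant. Recall that $N = 2^n$ and $\bar N = n \cdot N$. We take the constants $\alpha < \lambda/3$,
$\beta < (\lambda - 3\alpha)/4$ and $\delta = (\lambda - 3\alpha - 4\beta)/4$. We also take the
output length $m < N^{\alpha}$ and the extractor bias $\varepsilon > N^{-\beta}$. Note that
$\ell = \lceil (3/\delta)\log(2m/\varepsilon) \rceil = O(n)$.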
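
The claim $\ell = O(n)$ follows from a short estimate. It is sketched here under the assumption that logarithms are taken in base 2, with $m < N^{\alpha}$ and $\varepsilon > N^{-\beta}$ as chosen above.

```latex
\[
\ell \;\le\; \frac{3}{\delta}\log\frac{2m}{\varepsilon} + 1
     \;<\; \frac{3}{\delta}\log\bigl(2N^{\alpha+\beta}\bigr) + 1
     \;=\; \frac{3}{\delta}\bigl(1 + (\alpha+\beta)\,n\bigr) + 1
     \;=\; O(n).
\]
```

The last step uses that $\alpha$, $\beta$ and $\delta$ are constants, and that $\delta > 0$ because $\beta < (\lambda - 3\alpha)/4$.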

Theorem 7.6 Assume that the parameters $N$, $\lambda$, $m$, $\ell$ and $\varepsilon$ satisfy the
above requirements. Then the function
$E : \{0,1\}^{\bar N} \times \{0,1\}^{2\ell n} \to \{0,1\}^m$, defined above, is a
$(\lambda\bar N, 2\varepsilon)$-extractor.

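Proof Assume $X \in \{0,1\}^{\bar N}$ does not hit a test $D \subseteq \{0,1\}^m$
$\varepsilon$-correctly via $E$. Then $X$ can be described by one of the circuits
$C_1, \ldots, C_{2^{m+2}-4}$ given by Lemma 7.4 and, according to Lemma 7.5, by a string of
length $h$, where
$h \le 2\delta Nn + mn + N\log n + (\log 6)N + N\log(1/\delta) + N\log\log(2/\gamma) + N\log(1/\gamma) + N\log T + 36n^2 \cdot m \cdot \frac{1}{\gamma} \cdot \left(\frac{1}{\delta}\right)^2 \left(\log\frac{2}{\gamma}\right)^2$.
Since there are fewer than $2^{m+2}$ circuits and at most $2^h$ strings of length $h$, the number
of strings $X$ that do not hit $D$ $\varepsilon$-correctly via $E$ is bounded by $2^{m+2+h}$. For
our choice of parameters, it holds that $m + 2 + h < \lambda\bar N - \log(1/\varepsilon)$.
Therefore, by the earlier lemma that converts such a counting bound into an extractor guarantee,
$E$ is a $(\lambda\bar N, 2\varepsilon)$-extractor. $\blacksquare$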

The construction scheme of the last extractor allows some flexibility in the choice of parameters
and, in particular, we can obtain an extractor with seed length logarithmic in the length of the
weakly random string. Namely, we can consider the weakly random string $X$ to be the truth-table
of a function of type $X : \{0,1\}^n \to \{0,1\}^{N_1}$, where $N_1 \gg n$. We use the same value
of $\ell$, and we take the $\ell$-direct product of $X$ and obtain
$\bar X : \{0,1\}^{\ell n} \to \{0,1\}^{\ell N_1}$. Clearly, $|\bar X(\bar x)| = \ell N_1$. To get
a short seed we need to replace the Hadamard code (recall that the function $b(x,r)$ gives the
$r$-th bit of $\mathrm{Had}(x)$) by an error-correcting code that has a better rate and still has a
good list-decoding property. For example, the code given in [GHSZ02], which we denote Code, is of
the type $\mathrm{Code} : \{0,1\}^n \to \{0,1\}^{\bar n}$, with
$\bar n = O(n \cdot (1/\varepsilon)^4)$, is computable in polynomial time, and has the property
that any ball of radius $(1/2) - \varepsilon$ contains at most $O((1/\varepsilon)^2)$ codewords.
Similarly to the function $b$, we define the function $c(x,r)$ = the $r$-th bit of
$\mathrm{Code}(x)$, for $x \in \{0,1\}^{\ell N_1}$ and any binary string $r$ with length
$|r| = \log|\mathrm{Code}(x)| = \log(\ell \cdot N_1 \cdot (1/\varepsilon)^4) + O(1)$. We define the
extractor $E'$ by

$E'(X, (\bar x, r)) = c(\bar X(\bar x), r) \odot c(\bar X(\bar x + 1), r) \odot \ldots \odot c(\bar X(\bar x + m - 1), r). \qquad (7)$
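
The shape of this construction can be illustrated with a small sketch. It is not the paper's implementation. The function names (`hadamard_bit`, `direct_product`, `extract`) are hypothetical, and the Hadamard bit $b(x,r) = \langle x, r\rangle \bmod 2$ stands in for the bit function $c$ of the code from [GHSZ02], which is not implemented here. The sketch also assumes that the direct product concatenates the values of $X$ on the $\ell$ blocks of its argument, and that $\bar x + i$ is taken modulo $2^{\ell n}$.

```python
from typing import Callable, List, Sequence

Bits = List[int]


def hadamard_bit(x: Sequence[int], r: Sequence[int]) -> int:
    """r-th bit of Had(x): the inner product of x and r over GF(2)."""
    assert len(x) == len(r)
    return sum(a & b for a, b in zip(x, r)) % 2


def direct_product(X: Sequence[Bits], ell: int, n: int) -> Callable[[int], Bits]:
    """ell-direct product of X (assumed form): split an (ell*n)-bit index
    into ell blocks of n bits and concatenate the values of X on them."""
    mask = (1 << n) - 1

    def X_bar(idx: int) -> Bits:
        out: Bits = []
        for j in range(ell):
            # j-th block of idx, most significant block first
            block = (idx >> (n * (ell - 1 - j))) & mask
            out.extend(X[block])
        return out

    return X_bar


def extract(X: Sequence[Bits], ell: int, n: int, m: int, x: int, r: Bits,
            bit: Callable[[Sequence[int], Sequence[int]], int] = hadamard_bit) -> Bits:
    """Output bit i is bit(X_bar(x + i), r), for i = 0, ..., m-1.
    Wrap-around modulo 2**(ell*n) is an assumption of this sketch."""
    X_bar = direct_product(X, ell, n)
    modulus = 1 << (ell * n)
    return [bit(X_bar((x + i) % modulus), r) for i in range(m)]


# Toy usage: n = 2, X is the truth table of the identity on 2-bit strings.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
print(extract(X, ell=2, n=2, m=3, x=6, r=[1, 0, 1, 1]))  # [1, 0, 1]
```

Each output bit depends on one value of the direct product and on $r$ only, so the bits can be computed separately. Passing a different `bit` function is how a code with a better rate would be plugged in.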

The analysis is very similar to that done for the previous extractor $E$. For example, if we
assume that $\varepsilon > 2^{-(1/3)n}$, take $N_1 = 2^{n^2}$ and $m = 2^{(1/3)n}$, and denote the
length of $X$ by $\bar N$ (i.e., $\bar N = 2^n \cdot N_1$), we obtain a quite simple extractor
that has seed length $O(\log \bar N)$, can extract from sources with min-entropy $\lambda\bar N$,
for an arbitrary constant $\lambda > 0$, and has output length
$\approx 2^{(1/3)\sqrt{\log \bar N}}$. This extractor has a good seed length; however, its output
length is much smaller than the min-entropy of the source.